1. Data
Content
The DESED (Domestic Environment Sound Event Detection) dataset contains:
- Recorded soundscapes.
- Synthetic soundbank (+ code to create new soundscapes using Scaper) and DCASE 2019 soundscapes.
- Public evaluation (recorded soundscapes) used in DCASE 2019 (a.k.a. the Youtube eval set in DCASE; Vimeo is not available).
Overview
The dataset is split into two subsets as described below.
Recorded soundscapes
- Verified and unverified subsets of Audioset.
  - Unlabel_in_domain data: unverified data whose labels have been discarded: 14412 files.
  - Weakly labeled data: training data whose labels have been verified at the clip level: 1578 files.
  - Validation data: labels include time boundaries (strong labels): 1168 files.
  - Evaluation public files: 692 Youtube files.
Synthetic soundscapes
- Background files are extracted from SINS [2], MUSAN [3] or Youtube and were selected because they contain very few occurrences of the target sound event classes.
- Foreground files are extracted from Freesound [4][5], manually verified for quality, and segmented to remove silences.
- Mixtures are described in Generating new synthetic data.
- Sound bank:
  - Training: 2060 background files (SINS) and 1009 foreground files (Freesound).
  - Eval: 12 (Freesound) + 5 (Youtube) background files and 314 foreground files (Freesound).
Bibliography
You can find information about this dataset in these papers:
- Turpault et al. Description of DESED dataset + official results of DCASE 2019 task 4.
- Serizel et al. Robustness of DCASE 2019 systems on synthetic evaluation set.
Relation to DCASE task 4
For more information about the DCASE 2019 dataset, see Desed for DCASE 2019 task 4 below, or visit the DCASE 2019 task 4 web page.
2. Data clips
This page explains how to download the recorded clips. The training and validation sets are downloaded separately from the public evaluation set.
Training set and validation set
- The real_data folder
- Clone this repo
cd real_data/src
python download_real_data.py
- Send a mail with the csv files in the real_data/missing_files folder to nicolas (and romain)
- If you want to recreate the dcase2019 repo structure, launch create_dcase2019_dataset.sh from the real_data folder
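Before mailing the csv files, it can help to see what the download step left behind. The following is a minimal sketch, not part of the DESED tooling: the helper name summarize_missing_files is hypothetical, and the layout of the csv files is not described here, so the code only counts non-empty lines (which would include a header row, if there is one).

```python
from pathlib import Path


def summarize_missing_files(folder="real_data/missing_files"):
    """Return {csv file name: number of non-empty lines} for the given folder.

    Assumption: the folder holds plain-text csv files. The count includes any
    header row, because the exact csv layout is not documented on this page.
    """
    folder = Path(folder)
    if not folder.is_dir():
        return {}  # folder absent: nothing to report
    summary = {}
    for csv_path in sorted(folder.glob("*.csv")):
        with csv_path.open(encoding="utf-8") as handle:
            summary[csv_path.name] = sum(1 for line in handle if line.strip())
    return summary


if __name__ == "__main__":
    # Run from the repo root after download_real_data.py has finished.
    for name, count in summarize_missing_files().items():
        print(f"{name}: {count} non-empty lines")
```

Run from the repository root, this prints one line per csv file, which gives a rough idea of how many clips to mention in the mail.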
Public evaluation set
The evaluation data are in the following repo: DESED_public_eval.
It corresponds to the “youtube” subset in the desed eval paper and in task 4 of the DCASE 2019 Challenge.
- Download DESED_public_eval.tar.gz
tar -xzvf DESED_public_eval.tar.gz
- To move it to dcase2019, merge dataset/ with dcase2019/dataset.
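The extract-and-merge step can also be scripted. The sketch below uses only the Python standard library and assumes that the archive unpacks to a top-level dataset/ folder, as the merge instruction suggests. The function name extract_and_merge and its default paths are illustrative, not part of the DESED repository.

```python
import shutil
import tarfile
from pathlib import Path


def extract_and_merge(archive="DESED_public_eval.tar.gz",
                      extract_to=".",
                      target="dcase2019/dataset"):
    """Extract the archive, then merge its dataset/ folder into target.

    Assumption: the archive contains a top-level dataset/ directory.
    """
    extract_to = Path(extract_to)
    extract_to.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(extract_to)  # counterpart of: tar -xzvf <archive>
    source = extract_to / "dataset"
    # dirs_exist_ok=True (Python 3.8+) copies into an existing tree;
    # files that share a relative path are overwritten, others are kept.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return Path(target)


if __name__ == "__main__":
    merged = extract_and_merge()
    print(f"Merged public evaluation data into {merged}")
```

copytree copies rather than moves, so the extracted dataset/ folder stays in place and can be deleted once the merged tree has been checked.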
Class-wise statistics
| Class | Training (weak) clips | Validation clips | Validation events | Public evaluation clips | Public evaluation events |
| Alarm/bell/ringing | 205 | 187 | 420 | 79 | 196 |
| Blender | 134 | 80 | 96 | 73 | 84 |
| Cat | 173 | 121 | 341 | 70 | 240 |
| Dishes | 184 | 171 | 567 | 136 | 488 |
| Dog | 214 | 160 | 570 | 82 | 441 |
| Electric shaver/toothbrush | 103 | 62 | 65 | 84 | 108 |
| Frying | 171 | 89 | 94 | 88 | 90 |
| Running water | 343 | 197 | 237 | 92 | 109 |
| Speech | 550 | 627 | 1754 | 314 | 913 |
| Vacuum cleaner | 167 | 91 | 92 | 94 | 96 |
| Total | 1578 | 1168 | 4093 | 692 | 2765 |
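The public evaluation columns can be checked with a few lines of arithmetic. The sketch below copies the per-class clip and event counts from the table, sums them, and computes events per clip. The per-class clip counts add up to more than the 692 clips in the Total row, which is what one would expect if a single clip can contain several classes. That reading is an inference from the numbers, not a statement from the dataset description.

```python
# (clips, events) per class for the public evaluation set, copied from the table.
PUBLIC_EVAL = {
    "Alarm/bell/ringing": (79, 196),
    "Blender": (73, 84),
    "Cat": (70, 240),
    "Dishes": (136, 488),
    "Dog": (82, 441),
    "Electric shaver/toothbrush": (84, 108),
    "Frying": (88, 90),
    "Running water": (92, 109),
    "Speech": (314, 913),
    "Vacuum cleaner": (94, 96),
}
TOTAL_CLIPS, TOTAL_EVENTS = 692, 2765  # "Total" row of the table


def events_per_clip(stats):
    """Average number of events per clip for each class."""
    return {name: events / clips for name, (clips, events) in stats.items()}


if __name__ == "__main__":
    summed_events = sum(events for _, events in PUBLIC_EVAL.values())
    summed_clips = sum(clips for clips, _ in PUBLIC_EVAL.values())
    print(f"events: {summed_events} (table total {TOTAL_EVENTS})")
    print(f"per-class clips: {summed_clips} (table total {TOTAL_CLIPS})")
    ranked = sorted(events_per_clip(PUBLIC_EVAL).items(), key=lambda kv: -kv[1])
    for name, ratio in ranked:
        print(f"{name}: {ratio:.2f} events per clip")
```

Short, repeated sounds such as Dog come out with several events per clip, while continuous sounds such as Vacuum cleaner or Frying stay close to one event per clip.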
ref
https://project.inria.fr/desed/description/