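VR, AR, and MR (collectively, XR) are expected to become the next-generation computing platform after the personal computer and the smartphone, and the hardware gateway to the metaverse.

XR has advanced rapidly in recent years. Beyond games and film, it shows great promise in military, medical, office, and educational applications. Yet compared with XR's powerful display technology, its text input technology still lags far behind in efficiency and flexibility.

Text input is a basic function of any computer. Personal computers rely on physical keyboards, and phones rely on touchscreen soft keyboards. For XR to become a general-purpose computing platform, it must meet this basic need. Human-computer interaction researchers have developed many typing techniques for XR, but each has shortcomings, and no mainstream input technique has yet emerged. The sections below survey representative XR text input techniques by category.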

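Hand input with peripheral devices

Physical keyboards

Building on users' familiarity with physical keyboards, researchers have used the physical QWERTY keyboard or its variants as an input device in XR. Because users of XR applications usually cannot see the physical keyboard and their hands (or cannot do so conveniently), some researchers blend video of the keyboard and hands into the display, while others use optical tracking to follow the keyboard and hands and provide visual feedback.

Knierim et al. (2018)

These methods generally support fast, rich text input, but a physical keyboard needs a flat supporting surface and is too bulky to carry, which restricts where XR can be used. In addition, most physical-keyboard approaches rely on tracking to achieve good typing performance, which is itself difficult to implement.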

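Conventional controllers

TV remotes, conventional game controllers, and similar devices support simple text input: the user selects a letter or symbol through multiple discrete button presses. They are more portable than a full-size physical keyboard and need no supporting surface. However, they support only the simplest text input, and the repeated discrete presses make input very slow.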
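The cost of repeated discrete presses can be made concrete with a small sketch. It assumes a phone-style multi-tap layout, which the text does not specify; the `KEYPAD` table and the `multitap_presses` function are hypothetical names chosen for illustration.

```python
# Hypothetical phone-style keypad layout, assumed purely for illustration.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Reverse lookup: letter -> (key, number of presses needed on that key).
PRESSES = {
    letter: (key, index + 1)
    for key, letters in KEYPAD.items()
    for index, letter in enumerate(letters)
}


def multitap_presses(word: str) -> int:
    """Count the button presses needed to enter `word` with multi-tap."""
    return sum(PRESSES[ch][1] for ch in word.lower())
```

Under this assumed layout, the five-letter word "hello" costs 13 presses, before counting any pause needed between consecutive letters that share a key.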

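In addition, researchers have experimented with dividing the virtual keyboard into multiple regions. For example, PizzaText uses the two thumbsticks of a game controller to make selections in a circular keyboard layout, while Min (2011) divides the keyboard layout into 3×3 cells and selects keys by pointing at a cell.

PizzaText (Yu et al., 2018)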
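As a rough illustration of region-based selection with two thumbsticks, the sketch below maps one stick's direction to a slice of a circular layout and the other stick's direction to a character within that slice. The slice contents, the role of each stick, and the names `SLICES`, `stick_sector`, and `select_char` are assumptions made for this example, not the layout evaluated by Yu et al.

```python
import math

# Assumed layout: 7 slices of 4 characters each (illustrative only).
SLICES = ["abcd", "efgh", "ijkl", "mnop", "qrst", "uvwx", "yz,."]


def stick_sector(x: float, y: float, sectors: int) -> int:
    """Map a thumbstick deflection (x, y) to one of `sectors` equal angular sectors."""
    angle = math.atan2(y, x) % (2 * math.pi)  # angle in [0, 2*pi)
    return int(angle / (2 * math.pi / sectors)) % sectors


def select_char(slice_stick, char_stick) -> str:
    """Pick a slice with one stick, then a character in that slice with the other."""
    letters = SLICES[stick_sector(*slice_stick, len(SLICES))]
    return letters[stick_sector(*char_stick, len(letters))]
```

With these assumptions, two coarse directional movements select one character, so no fine pointing is needed, at the price of memorizing where each character sits.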

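Although typing with a controller in XR is easy to learn, these methods are slow. They are mainly suited to low-difficulty tasks such as simple key selection, not to complex, high-volume text entry in applications such as office work.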

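Touchscreens

Text input on a touchscreen or touchpad brings familiar phone interaction into virtual reality. For example, BlindType explores eyes-free typing with one thumb on a touchpad: the user taps on a surface bearing an imaginary QWERTY keyboard and receives text feedback on the display. Methods of this kind usually align a virtual interface with the touchscreen (for example, a smartphone) for input.

BlindType (Lu et al., 2017)

However, these methods usually require an additional large touchscreen or touchpad as the input device, which makes XR less convenient to use. Moreover, touchscreen typing depends on indirect visual feedback and demands extra attention to track the current state, which is less intuitive and effective than tactile feedback.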

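6DoF controllers

Most consumer VR headsets, such as Oculus Quest and Pico, support selecting letters with a ray cast from the controller. This input method is intuitive and easy to learn, but it requires sustained visual attention and fine controller control, and users tire noticeably when typing for long periods. It is therefore also unsuited to complex, high-volume text entry.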
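Ray-based selection amounts to intersecting the controller's ray with the keyboard plane and looking up the key under the hit point. The following is a minimal sketch under simplifying assumptions: the keyboard lies in the plane z = `KEYBOARD_Z`, keys are unstaggered squares, and all constants and the `pick_key` function are hypothetical rather than taken from any headset's SDK.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_SIZE = 0.05            # key edge length in metres (assumed)
KEYBOARD_Z = 1.0           # keyboard plane z = 1.0 in front of the user (assumed)
TOP_LEFT = (-0.25, 0.15)   # x, y of the keyboard's top-left corner (assumed)


def pick_key(origin: Vec3, direction: Vec3) -> Optional[str]:
    """Return the key hit by the controller ray, or None if it misses."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz <= 0:
        return None  # ray is parallel to the plane or points away from it
    t = (KEYBOARD_Z - oz) / dz
    if t < 0:
        return None  # plane is behind the ray origin
    hit_x, hit_y = ox + t * dx, oy + t * dy
    col = math.floor((hit_x - TOP_LEFT[0]) / KEY_SIZE)
    row = math.floor((TOP_LEFT[1] - hit_y) / KEY_SIZE)
    if 0 <= row < len(ROWS) and 0 <= col < len(ROWS[row]):
        return ROWS[row][col]
    return None
```

Because the hit point moves by `t` times any change in direction, small tremors of the wrist are amplified at the keyboard, which is one way to see why this method demands fine controller control.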

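Hand tracking with head-mounted cameras

Some VR/AR headsets (such as Oculus Quest and HoloLens) track the user's hands with vision-based hand tracking. During typing, a virtual keyboard is projected into the user's field of view, the headset's cameras track the user's fingers in real time, and input is registered when a finger "touches" the virtual keyboard.

The HoloLens 2 virtual keyboard

This approach uses built-in head-mounted cameras to provide touch typing similar to that on a phone, but errors can occur under occlusion or fast finger movement, and the optical tracking involved is relatively expensive. In addition, prolonged mid-air operation tends to fatigue the user.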

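External cameras

Vulture (Markussen et al., 2014)

Cameras worn on the body or placed externally can recognize hand movements or gestures and thereby enable typing in XR. Methods such as TypeNet use optical tracking to support fast typing on a flat surface in front of a camera; the user types on a virtual QWERTY keyboard, which preserves existing typing habits. OmniTouch is a shoulder-worn depth-sensing and projection system that supports text input on surfaces such as the palm. Other methods recognize gestures to enable mid-air text input. Vulture, for example, tracks pinched fingers and outputs the word that best matches the fingers' trajectory in the air.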
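The idea of outputting the word that best matches a trajectory can be sketched with simple template matching: resample the traced path and each candidate word's ideal path through its key centres, then pick the word with the smallest average point-to-point distance. This is a deliberately simplified stand-in, not Vulture's actual decoder; the key coordinates and the names `KEY_POS`, `resample`, and `best_match` are assumptions for illustration.

```python
import math

# Assumed key centres on a unit grid with a half-key stagger per row.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POS = {
    ch: (col + 0.5 * row, float(row))
    for row, keys in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def resample(path, n=16):
    """Resample a polyline to n points evenly spaced along its length."""
    segments = list(zip(path, path[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    total = sum(lengths)
    if total == 0:
        return [path[0]] * n
    points = []
    for i in range(n):
        target = total * i / (n - 1)
        walked = 0.0
        for (a, b), length in zip(segments, lengths):
            if length > 0 and walked + length >= target:
                f = (target - walked) / length
                points.append((a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1])))
                break
            walked += length
        else:
            points.append(path[-1])  # rounding pushed the target past the end
    return points


def best_match(trace, vocabulary, n=16):
    """Return the vocabulary word whose ideal key path is closest to the trace."""
    sampled = resample(trace, n)

    def cost(word):
        template = resample([KEY_POS[ch] for ch in word], n)
        return sum(math.dist(p, q) for p, q in zip(sampled, template)) / n

    return min(vocabulary, key=cost)
```

A practical decoder would also weigh word frequency and tolerate differences in scale and position; this sketch compares shapes only in keyboard coordinates.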

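Text input based on external cameras does not depend on a physical keyboard, which makes it more flexible. However, studies show that (1) tracking errors during fast movements can limit typing speed, and (2) prolonged mid-air typing fatigues users. Because camera-based finger tracking remains a challenging problem, most of these studies use expensive optical tracking, which limits their practicality. Compared with headset-based hand tracking, typing with external cameras is also less convenient, because the user must carry additional bulky tracking equipment.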

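Wearable hand input

This category covers typing with sensors worn on the hand.

Gloves

Gloves, as wearable interactive sensing devices, are one route to text input. KITTY and DigiTouch use electrical contacts or partially conductive areas on the glove to detect finger touch events and support complex finger input. Argot is a wearable one-handed glove with 15 buttons that supports typing through magnetic response combined with tactile feedback.

However, all of these methods require the user to wear gloves, which interferes with everyday interaction and compromises flexibility and comfort.

Pinch Keyboard (Kuester et al., 2005)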

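Wrist-worn devices

Another wearable approach uses wrist-worn devices carrying various sensors. PalmType uses an infrared sensor on the left wrist to detect where the right index finger touches the left palm. BlueTap and DigiTap map letters to the fingers in sequence and use a wrist-mounted camera to detect taps. With finger occlusion and fast tapping, however, a wrist-worn camera struggles to detect tap locations accurately. There are also typing techniques based on the Myo electromyography (EMG) sensor (the technology was later acquired by Meta).

PalmType (Wang et al., 2015)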

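ViFin supports typing by recognizing the writing trajectory of a finger in mid-air. It uses the inertial measurement unit (IMU) of a smartwatch to sense the vibrations produced as the user's index finger moves, and decodes the written characters with a deep network. Its computational cost is high and its recognition accuracy is relatively low. TapType uses two wireless wristbands with IMUs (first introduced in TapID) to sense the vibrations produced when fingers tap a flat surface, and uses a decoder to estimate the character sequence typed with ten fingers. As a learning-based finger identification method, TapType needs a classifier trained in a supervised manner and may generalize poorly across users and surface materials.

TapType (Streli et al., 2022)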
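Estimating text from finger identities alone is ambiguous, since each finger covers several keys, so a decoder has to combine the classifier's per-tap finger probabilities with a language prior. The toy sketch below does this with Bayes' rule over a fixed word list. It is not TapType's decoder: the touch-typing finger assignment, the word priors, and the names `FINGER_OF` and `decode` are assumptions for illustration.

```python
import math

# Assumed standard touch-typing assignment of letter keys to fingers.
FINGER_KEYS = {
    "L-pinky": "qaz", "L-ring": "wsx", "L-middle": "edc", "L-index": "rfvtgb",
    "R-index": "yhnujm", "R-middle": "ik", "R-ring": "ol", "R-pinky": "p",
}
FINGER_OF = {key: finger for finger, keys in FINGER_KEYS.items() for key in keys}


def decode(tap_probs, word_priors, floor=1e-6):
    """Return the word maximizing prior * product of per-tap finger probabilities.

    tap_probs: one dict per tap, mapping finger name -> classifier probability.
    word_priors: dict mapping candidate word -> prior probability.
    """
    best_word, best_score = None, -math.inf
    for word, prior in word_priors.items():
        if len(word) != len(tap_probs):
            continue
        score = math.log(prior)
        for ch, probs in zip(word, tap_probs):
            score += math.log(probs.get(FINGER_OF[ch], floor))
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```

Under this assignment, "the" and "bye" produce the same finger sequence (left index, right index, left middle), so only the prior separates them. This also illustrates why such decoders depend on a vocabulary or language model.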

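Finger-worn devices

Finger-worn devices are flexible and lightweight, and can support expressive text input by sensing finger rotation or touch events and recognizing gestures. For example, FingeRing generates key mappings from combinations of finger-tap events detected by accelerometers, and allows input on any surface (such as the waist or thigh). Similarly, TypingRing and QwertyRing measure finger motion with different sensors and identify the selected key of a virtual keyboard on any desk-like surface. TypeAnywhere uses two Tap Strap devices (commercial IMU-based finger-worn wearables) to detect taps on any surface. RotoSwype uses a motion-tracking sensor and a ring with a button to support gesture-based text input.

TypeAnywhere (Zhang et al., 2022)

Gesture-based methods lack enough keys for fast input, so they are unsuited to complex text entry. Devices that sense key events from motion information such as acceleration, or from hand rotation, usually need auxiliary sensing to ensure accurate input. These methods require very precise input actions, which beginners find unintuitive and hard to learn in an XR environment. In addition, most of the techniques above use statistical decoders to improve input speed and correct errors, and offer no efficient way to handle tasks such as typing with larger symbol sets.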

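Besides typing by detecting finger motion, text input by sensing touch events on the body has also been widely studied. TipText, BiTipText, and ThumbText place miniature touch sensors on the fingers and select a character after two touch events. FingerText and FingerT9 use capacitive sensing to support one-handed text input through in-hand touches between the thumb and the other fingers.

TipText (Xu et al., 2019)

In the touch-sensor techniques above, keys are mapped to different regions of the fingers, palm, or fingernails. These methods take full advantage of the fingers' dexterity. However, the input area for finger typing is usually very limited (if it is made too large, it inherits the drawbacks of gloves), which limits the number of keys and lowers typing efficiency. In fact, few of them support full character-set input.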

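To address this problem, PrinType uses a fingerprint sensor worn on the thumb to identify different regions of the fingers, greatly expanding the input space without hindering the fingers in other tasks. PrinType identifies where the sensor makes contact by matching the current image against registered templates. Different finger regions are assigned to different keys, forming a fully functional keyboard in virtual reality. Thanks to the dexterity of the fingers and the effectiveness of fingerprint recognition, most locations on the palmar side of the fingers can be touched and recognized, which accommodates the large number of discrete keys on a typical keyboard. By exploiting the sensitive touch perception of the fingertips together with real-time visual feedback, PrinType supports eyes-free typing in virtual reality.

PrinType fingerprint typing supports a large symbol set, including uppercase and lowercase letters, digits, and punctuation (Liu et al., 2022).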
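The template-matching step can be pictured as a nearest-template lookup: each registered template is tied to a key, and the captured patch is assigned to the key whose template it resembles most, provided the resemblance is strong enough. The sketch below uses normalized correlation on short feature vectors as a stand-in; real fingerprint matching is far more involved, and the feature representation, the threshold, and the names `similarity` and `identify_key` are assumptions, not PrinType's implementation.

```python
import math


def similarity(a, b):
    """Normalized correlation between two equal-length feature vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def identify_key(patch, templates, threshold=0.8):
    """Return the key of the best-matching template, or None if none is close.

    templates: list of (key, feature_vector) pairs registered in advance.
    """
    best_key, best_sim = None, threshold
    for key, features in templates:
        sim = similarity(patch, features)
        if sim >= best_sim:
            best_key, best_sim = key, sim
    return best_key
```

Rejecting matches below the threshold lets ambiguous or partial contacts produce no key rather than a wrong one, which matters when many keys are packed onto a small skin area.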

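Other techniques

XR headsets embed a variety of sensors that can measure head and eye movement or record sound. Advances in speech recognition have made speech-to-text a mature text input method, but speaking is inconvenient in some settings, and voice input lacks effective text-editing techniques. Yu et al. and Speicher et al. compared head-pointing text input techniques, and Ma et al. and Rajanna and Hansen studied gaze typing in VR. In these methods, however, the user must move a cursor to the target and then select the character by keeping the gaze on it for a certain dwell time, which causes eye fatigue during input.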
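Dwell-based selection can be sketched as a small state machine over gaze samples: a key is selected once the gaze has rested on it continuously for at least the dwell time, and it is not selected again until the gaze leaves and returns. The sample format, the 0.8-second default, and the `dwell_select` name are assumptions for illustration.

```python
def dwell_select(samples, dwell_time=0.8):
    """Return the keys selected by dwelling.

    samples: iterable of (timestamp_seconds, key_under_gaze or None),
    ordered by time.
    """
    selected = []
    current, start, fired = None, 0.0, False
    for timestamp, key in samples:
        if key != current:
            # Gaze moved to a different key (or off the keyboard): restart timer.
            current, start, fired = key, timestamp, False
        elif key is not None and not fired and timestamp - start >= dwell_time:
            selected.append(key)
            fired = True  # do not repeat until the gaze leaves this key
    return selected
```

The sketch makes the trade-off visible: a shorter `dwell_time` speeds up entry but makes accidental selections more likely, while a longer one forces the user to hold their gaze still for every character.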

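Summary

Efficient text input in XR remains an unsolved problem. For a text input technique to catch on in XR, it must meet many criteria: beyond the potential for fast, accurate eyes-free input, it must be easy to learn, portable, and not too expensive. Most importantly, it must integrate well with the XR system and be convenient and comfortable to use, with a good user experience.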

References

Aakar Gupta, Cheng Ji, Hui-Shyong Yeo, Aaron Quigley, and Daniel Vogel. (2019). RotoSwype: Word-Gesture Typing Using a Ring. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–12).
Anna Peshock, Julia Duvall, and Lucy E. Dunne. (2014). Argot: A Wearable One-Handed Keyboard Glove. In Proceedings of the 2014 ACM International Symposium on Wearable Computers: Adjunct Program (pp. 87–92).
Ben Maman and Amit Bermano. (2022). TypeNet: Towards Camera Enabled Touch Typing on Flat Surfaces through Self-Refinement. In 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 567–576).
Chris Harrison, Hrvoje Benko, and Andrew D. Wilson. (2011). OmniTouch: Wearable Multitouch Interaction Everywhere. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology (pp. 441–450).
Chun Yu, Yizheng Gu, Zhican Yang, Xin Yi, Hengliang Luo, and Yuanchun Shi. (2017). Tap, Dwell or Gesture? Exploring Head-Based Text Entry Techniques for HMDs. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 4479–4488).
Dash, S. (2017). BlueTap—The Ultimate Virtual-Reality (VR) Keyboard.
Dube, T. J., & Arif, A. S. (2019, July). Text entry in virtual reality: A comprehensive review of the literature. In International Conference on Human-Computer Interaction (pp. 419-437). Springer, Cham.
Eric Whitmire, Mohit Jain, Divye Jain, Greg Nelson, Ravi Karkar, Shwetak Patel, and Mayank Goel. (2017). DigiTouch: Reconfigurable Thumb-to-Finger Input and Text Entry on Head-Mounted Displays. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(3), 113-134.
Falko Kuester, Michelle Chen, Mark E. Phair, and Carsten Mehring. (2005). Towards Keyboard Independent Touch Typing in VR. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology (pp. 86–95).
Junhyeok Kim, William Delamare, and Pourang Irani. (2018). ThumbText: Text Entry for Wearable Devices Using a Miniature Ring. In Proceedings of the 44th Graphics Interface Conference (pp. 18–25).
Knierim, P., Schwind, V., Feit, A. M., Nieuwenhuizen, F., & Henze, N. (2018, April). Physical keyboards in virtual reality: Analysis of typing performance and effects of avatar hands. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1-9).
Kyungha Min. (2011). Text Input Tool for Immersive VR Based on 3×3 Screen Cells. In Proceedings of the 5th International Conference on Convergence and Hybrid Information Technology (pp. 778–786). Springer-Verlag, Berlin, Heidelberg.
Lee, D., Kim, J., & Oakley, I. (2021, May). Fingertext: Exploring and optimizing performance for wearable, mobile and one-handed typing. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-15).
Liu, Z., He, J., Feng, J., & Zhou, J. (2022). PrinType: Text Entry via Fingerprint Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 6(4), 1-31.
Lu, Y., Yu, C., Yi, X., Shi, Y., & Zhao, S. (2017). Blindtype: Eyes-free text entry on handheld touchpad by leveraging thumb’s muscle memory. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(2), 1-24.
Manuel Prätorius, Dimitar Valkov, Ulrich Burgbacher, and Klaus Hinrichs. (2014). DigiTap: An Eyes-Free VR/AR Symbolic Input Device. In Proceedings of the 20th ACM Symposium on Virtual Reality Software and Technology (pp. 9–18).
Marco Speicher, Anna Maria Feit, Pascal Ziegler, and Antonio Krüger. (2018). Selection-Based Text Entry in Virtual Reality. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1–13).
Markussen, A., Jakobsen, M. R., & Hornbæk, K. (2014, April). Vulture: a mid-air word-gesture keyboard. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1073-1082).
Masaaki Fukumoto and Yasuhito Suenaga. (1994). “FingeRing”: A Full-Time Wearable Interface. In Conference Companion on Human Factors in Computing Systems (pp. 81–82).
Pui Chung Wong, Kening Zhu, and Hongbo Fu. (2018). FingerT9: Leveraging Thumb-to-Finger Interaction for Same-Side-Hand Text Entry on Smartwatches. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1–10).
Shahriar Nirjon, Jeremy Gummeson, Dan Gelb, and Kyu-Han Kim. (2015). TypingRing: A Wearable Ring Platform for Text Input. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services (pp. 227–239).
Streli, P., Jiang, J., Fender, A. R., Meier, M., Romat, H., & Holz, C. (2022, April). TapType: Ten-finger text entry on everyday surfaces via Bayesian inference. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (pp. 1-16).
Vijay Rajanna and John Paulin Hansen. (2018). Gaze Typing in Virtual Reality: Impact of Keyboard Design, Selection Method, and Motion. In Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications (pp. 15-25).
Wang, C. Y., Chu, W. C., Chiu, P. T., Hsiu, M. C., Chiang, Y. H., & Chen, M. Y. (2015, August). PalmType: Using palms as keyboards for smart glasses. In Proceedings of the 17th International Conference on Human-Computer Interaction with Mobile Devices and Services (pp. 153-160).
Wenqiang Chen, Lin Chen, Meiyi Ma, Farshid Salemi Parizi, Shwetak Patel, and John Stankovic. (2021). ViFin: Harness Passive Vibration to Continuous Micro Finger Writing with a Commodity Smartwatch. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 5(1), 45-70.
Xinyao Ma, Zhaolin Yao, Yijun Wang, Weihua Pei, and Hongda Chen. (2018). Combining Brain-Computer Interface and Eye Tracking for High-Speed Text Entry in Virtual Reality. In 23rd International Conference on Intelligent User Interfaces (pp. 263–267).
Yizheng Gu, Chun Yu, Zhipeng Li, Zhaoheng Li, Xiaoying Wei, and Yuanchun Shi. (2020). QwertyRing: Text Entry on Physical Surfaces Using a Ring. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(4), 128-157.
Yu, D., Fan, K., Zhang, H., Monteiro, D., Xu, W., & Liang, H. N. (2018). PizzaText: text entry for virtual reality systems using dual thumbsticks. IEEE transactions on visualization and computer graphics, 24(11), 2927-2935.
Zhang, M. R., Zhai, S., & Wobbrock, J. O. (2022, April). TypeAnywhere: A QWERTY-Based Text Entry Solution for Ubiquitous Computing. In CHI Conference on Human Factors in Computing Systems (pp. 1-16).
Zheer Xu, Pui Chung Wong, Jun Gong, Te-Yen Wu, Aditya Shekhar Nittala, Xiaojun Bi, Jürgen Steimle, Hongbo Fu, Kening Zhu, and Xing-Dong Yang. (2019). TipText: Eyes-Free Text Entry on a Fingertip Keyboard. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology (pp. 883–899).
Zheer Xu, Weihao Chen, Dongyang Zhao, Jiehui Luo, Te-Yen Wu, Jun Gong, Sicheng Yin, Jialun Zhai, and Xing-Dong Yang. (2020). BiTipText: Bimanual Eyes-Free Text Entry on a Fingertip Keyboard. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1–13).