LipLearner: Customizable Silent Speech Interactions on Mobile Devices

Su, Zixiong; Fang, Shitao; Rekimoto, Jun

doi:10.1145/3544548.3581465

arXiv:2302.05907 (cs)

[Submitted on 12 Feb 2023 (v1), last revised 5 Mar 2023 (this version, v3)]

Abstract: Silent speech interfaces are a promising technology that enables private communication in natural language. However, previous approaches support only a small and inflexible vocabulary, which limits expressiveness. We leverage contrastive learning to learn efficient lipreading representations, enabling few-shot command customization with minimal user effort. Our model exhibits high robustness to different lighting, posture, and gesture conditions on an in-the-wild dataset. For 25-command classification, an F1-score of 0.8947 is achievable using only one shot, and performance can be further boosted by adaptively learning from more data. This generalizability allowed us to develop a mobile silent speech interface empowered with on-device fine-tuning and visual keyword spotting. A user study demonstrated that with LipLearner, users could define their own commands with high reliability guaranteed by an online incremental learning scheme. Subjective feedback indicated that our system provides essential functionalities for customizable silent speech interactions with high usability and learnability.

Comments:	Conditionally accepted to the ACM CHI Conference on Human Factors in Computing Systems 2023 (CHI '23)
Subjects:	Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2302.05907 [cs.HC]
	(or arXiv:2302.05907v3 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2302.05907
Related DOI:	https://doi.org/10.1145/3544548.3581465

Submission history

From: Zixiong Su
[v1] Sun, 12 Feb 2023 13:10:57 UTC (18,207 KB)
[v2] Tue, 14 Feb 2023 07:56:45 UTC (18,207 KB)
[v3] Sun, 5 Mar 2023 07:58:44 UTC (18,593 KB)
