{"id":2985,"date":"2023-06-18T10:46:58","date_gmt":"2023-06-18T02:46:58","guid":{"rendered":"https:\/\/linguaresources.com\/?p=2985"},"modified":"2023-06-18T11:55:42","modified_gmt":"2023-06-18T03:55:42","slug":"speechgen%ef%bc%9a%e7%94%a8%e6%8f%90%e7%a4%ba%e8%a7%a3%e9%94%81%e8%af%ad%e9%9f%b3%e8%af%ad%e8%a8%80%e6%a8%a1%e5%9e%8bspeech-lm%e7%9a%84%e7%94%9f%e6%88%90%e8%83%bd%e5%8a%9b","status":"publish","type":"post","link":"https:\/\/linguaresources.com\/?p=2985","title":{"rendered":"SpeechGen: Unlocking the Generative Power of Speech Language Models (Speech LM) with Prompts"},"content":{"rendered":"

\n \n<\/p>\n

\n  <\/strong><\/span><\/strong><\/em> <\/strong><\/span><\/strong><\/em><\/span>Paper link:<\/span><\/strong>\n<\/p>\n

\n https:\/\/arxiv.org\/pdf\/2306.02207.pdf<\/span>
\n<\/section>\n

\n  <\/strong><\/span><\/strong><\/em> <\/strong><\/span><\/strong><\/em><\/span>Demo:<\/span><\/strong>\n<\/p>\n

\n https:\/\/ga642381.github.io\/SpeechPrompt\/speechgen.html<\/span>
\n<\/section>\n

\n  <\/strong><\/span><\/strong><\/em> <\/strong><\/span><\/strong><\/em><\/span>Code:<\/span><\/strong>\n<\/p>\n

\n https:\/\/github.com\/ga642381\/SpeechGen<\/span>\n<\/p>\n

\n \n<\/p>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n Introduction and Motivation<\/strong><\/span><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n

\n Large language models (LLMs) have attracted considerable attention for AI-generated content (AIGC), especially since the arrival of ChatGPT.<\/span>\n<\/p>\n

\n
<\/span>\n<\/p>\n

\n However, processing continuous speech with large language models remains an unsolved challenge, and it hinders their application to speech generation.<\/span>\n<\/p>\n

\n
<\/span>\n<\/p>\n

\n Because speech signals carry rich information beyond plain text, including speaker identity and emotion, speech-based language models (Speech Language Models, Speech LMs) keep emerging.<\/span>\n<\/p>\n

\n
<\/span>\n<\/p>\n

\n Although speech LMs are still at an early stage compared with text-based language models, they hold great potential because speech data carries richer information than text.<\/span>\n<\/p>\n

\n          
<\/span>\n<\/p>\n

\n Researchers are actively exploring the prompting paradigm as a way to harness pre-trained language models. Prompting tunes a small number of parameters to steer a pre-trained language model toward a specific downstream task. The technique is popular in NLP because it is both efficient and effective. In speech processing, SpeechPrompt has shown significant improvements in parameter efficiency and achieved competitive performance on a variety of speech classification tasks.<\/span>\n<\/p>\n

\n
<\/span>\n<\/p>\n

\n However, whether prompting can help speech LMs perform generation tasks remains an open question. In this paper, we propose a novel unified framework, SpeechGen, that aims to unlock the potential of speech LMs for generation tasks. As shown in the figure below, feeding a speech segment and a task-specific prompt to a speech LM as input lets the speech LM perform a specific task. For example, with the red prompt as input, the speech LM performs speech translation.<\/span>\n<\/p>\n

\n
<\/span>\n<\/p>\n

\n \n<\/p>\n

\n Our proposed framework has the following advantages:<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 1.<\/span> Textless:<\/span><\/strong> Our framework and the speech LMs it relies on are independent of text data, which is immensely valuable. Obtaining labeled text paired with speech is time-consuming and tedious, and for some languages suitable text does not exist at all. Being textless lets the framework's speech generation capability cover a wide range of languages.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 2.<\/span> Versatility:<\/span><\/strong> Our framework is highly general and applies to a wide variety of speech generation tasks. The experiments in the paper use speech translation, speech inpainting, and speech continuation as examples. <\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 3.<\/span> Easy to follow:<\/span><\/strong> Our framework provides a general solution for all kinds of speech generation tasks, which makes designing downstream models and loss functions straightforward.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 4.<\/span> Transferability:<\/span><\/strong> Our framework adapts easily to more advanced future speech LMs and has considerable potential for further gains in efficiency and effectiveness. As more advanced speech LMs become available, the framework stands to benefit directly. <\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 5.<\/span> Affordability:<\/span><\/strong> Our framework is designed so that only a small number of parameters are trained, rather than the entire large language model. This greatly reduces the computational burden and allows training to run on a GTX 2080 GPU. University labs can afford this level of compute.<\/span>
\n<\/section>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n SpeechGen<\/strong><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/span>
\n<\/section>\n

\n \n<\/p>\n

\n Our approach builds a new framework, SpeechGen, for adapting speech language models (Speech LMs) to various downstream speech generation tasks. During training, the parameters of the Speech LM are kept fixed, and our method focuses on learning task-specific prompt vectors. By conditioning on both the prompt vectors and the input units, the Speech LM generates the output required for a given speech generation task. These discrete unit outputs are then fed into a unit-based speech synthesizer, which generates the corresponding waveform.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n The SpeechGen framework consists of three components: a speech encoder, a Speech LM, and a speech decoder. First, the speech encoder takes a waveform as input and converts it into a sequence of units drawn from a finite vocabulary. To shorten the sequence, consecutive repeated units are removed, yielding a compressed unit sequence. Next, the Speech LM acts as a language model over unit sequences, optimizing the likelihood by predicting each subsequent unit given the preceding units. We apply prompt tuning to the Speech LM to steer it toward generating the appropriate units for the task. Finally, the tokens generated by the Speech LM are processed by the speech decoder, which converts them back into a waveform. In our prompt tuning strategy, prompt vectors are inserted at the beginning of the input sequence, which guides the Speech LM during generation. The number of prompts inserted depends on the architecture of the Speech LM. In a sequence-to-sequence model, prompts are added to both the encoder input and the decoder input, whereas in an encoder-only or decoder-only architecture a single prompt is prepended to the input sequence.<\/span>
\n<\/section>\n
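The step that removes consecutive repeated units can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `collapse_repeats` and the example unit IDs are assumptions made for the sketch.

```python
from itertools import groupby


def collapse_repeats(units):
    """Keep one unit per run of consecutive duplicates."""
    return [unit for unit, _run in groupby(units)]


# Hypothetical frame-level unit IDs produced by a speech encoder.
frame_units = [71, 71, 71, 12, 12, 408, 71, 71]
compressed = collapse_repeats(frame_units)
print(compressed)  # [71, 12, 408, 71]
```

Only adjacent duplicates are merged, so a unit that recurs later in the utterance (71 above) is kept as a separate entry.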
\n
<\/span>
\n<\/section>\n
\n For sequence-to-sequence Speech LMs (such as mBART), we use a self-supervised learning model (such as HuBERT) to process both the input speech and the target speech, producing discrete units for the input and the corresponding discrete units for the target. We prepend prompt vectors to both the encoder input and the decoder input to construct the input sequences. In addition, we further strengthen the guidance provided by the prompts by replacing the key-value pairs in the attention mechanism.<\/span>
\n<\/section>\n
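Prepending prompt vectors to the encoder and decoder inputs can be pictured with plain Python lists. This is a rough sketch under stated assumptions: the helper `prepend_prompt`, the 2-dimensional embeddings, and the prompt lengths are all hypothetical, and the attention key-value modification mentioned above is not shown.

```python
def prepend_prompt(prompt_vectors, input_embeddings):
    """Return a new sequence with the prompt vectors placed before the inputs."""
    return list(prompt_vectors) + list(input_embeddings)


# Hypothetical 2-dimensional embeddings, for illustration only.
encoder_prompt = [[0.1, 0.2], [0.3, 0.4]]             # trainable prompt vectors
decoder_prompt = [[0.5, 0.6]]                         # trainable prompt vectors
source_units = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # embedded input units
target_units = [[0.0, 0.5]]                           # embedded target units

encoder_input = prepend_prompt(encoder_prompt, source_units)
decoder_input = prepend_prompt(decoder_prompt, target_units)
print(len(encoder_input), len(decoder_input))  # 5 2
```

The unit embeddings themselves are left untouched; only the prompt entries at the front of each sequence would be updated during training.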
\n
<\/span>
\n<\/section>\n
\n During training, we use cross-entropy loss as the objective for all generation tasks, computing the loss by comparing the model's predictions with the target discrete unit labels. The prompt vectors are the only trainable parameters in the model, while the parameters of the Speech LM stay fixed throughout training, which keeps the model's behavior consistent. By inserting prompt vectors, we guide the Speech LM to extract task-specific information from the input and increase the likelihood of producing output that fits the specific speech generation task. This approach lets us adapt the behavior of the Speech LM without modifying its underlying parameters.<\/span>
\n<\/section>\n
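The cross-entropy objective over discrete unit labels can be written out in plain Python. This is a minimal sketch, assuming per-step logits over a toy 3-unit vocabulary; the function name and numbers are illustrative, and the frozen Speech LM and the gradient update of the prompt vectors are not shown.

```python
import math


def cross_entropy(logits_per_step, target_units):
    """Mean negative log-likelihood of the target units under softmax(logits)."""
    total = 0.0
    for logits, target in zip(logits_per_step, target_units):
        max_logit = max(logits)  # subtract the max for numerical stability
        log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
        total += log_norm - logits[target]
    return total / len(target_units)


# Hypothetical logits for two decoding steps over a 3-unit vocabulary.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
targets = [0, 2]
loss = cross_entropy(logits, targets)
print(loss)
```

In a prompt tuning setup, this loss would be backpropagated only into the prompt vectors, with every Speech LM weight excluded from the optimizer.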
\n
<\/span>
\n<\/section>\n
\n In summary, our method is based on a new framework, SpeechGen, which trains prompt vectors to guide the model's generation process so that it produces output suited to a specific speech generation task.<\/span>
\n<\/section>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n Experiments<\/strong><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/span>
\n<\/section>\n
\n Our framework can be used with any speech LM and a wide range of generation tasks. In our experiments, because VALL-E and AudioLM are not open source, we use Unit mBART as the speech LM for a case study. We use speech translation, speech inpainting, and speech continuation as examples to demonstrate the capabilities of the framework. The three tasks are illustrated in the figure below. All tasks take speech as input and produce speech as output, without any help from text.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n
\n<\/section>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n Speech Translation<\/strong><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/span>
\n<\/section>\n
\n For speech translation, we train on a Spanish-to-English task. We feed the model Spanish speech and expect it to produce English speech, with no text involved at any point. Below are several speech translation examples showing the ground truth and the model prediction. These demos show that the model's predictions capture the core meaning of the ground truth.<\/span>
\n<\/section>\n
\n
\n<\/section>\n
\n
\n<\/section>\n
\n <\/span>
\n<\/section>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n Speech Inpainting<\/strong><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/span>
\n<\/section>\n
\n In our speech inpainting experiments, we select audio clips longer than 2.5 seconds as target speech and randomly pick a segment lasting between 0.8 and 1.2 seconds within each clip. We then mask the selected segment to simulate the missing or corrupted portion in a speech inpainting task. We use word error rate (WER) and character error rate (CER) to measure how well the corrupted segment is restored.<\/span>
\n<\/section>\n
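The segment selection described above can be sketched as follows. The text only says the segment is chosen at random, so uniform sampling of both the mask length and its start position is an assumption of this sketch, as are the names `sample_mask_span`, `MIN_CLIP_SECONDS`, and `MASK_RANGE_SECONDS`.

```python
import random

MIN_CLIP_SECONDS = 2.5            # only clips longer than this are used
MASK_RANGE_SECONDS = (0.8, 1.2)   # duration range of the masked segment


def sample_mask_span(clip_seconds, rng):
    """Return (start, end) in seconds of the segment to mask, or None if the clip is too short."""
    if clip_seconds <= MIN_CLIP_SECONDS:
        return None
    mask_len = rng.uniform(*MASK_RANGE_SECONDS)        # assumed: uniform length
    start = rng.uniform(0.0, clip_seconds - mask_len)  # assumed: uniform position
    return start, start + mask_len


rng = random.Random(0)
span = sample_mask_span(4.0, rng)
print(span)
```

How the masked region is represented to the model (silence, a special unit, or something else) is not specified here and is left out of the sketch.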
\n          
<\/span>
\n<\/section>\n
\n Comparing the output generated by SpeechGen with the corrupted speech, our model substantially reconstructs the spoken words, reducing WER from 41.68% to 28.61% and CER from 25.10% to 10.75%, as shown in the table below. This means the proposed method significantly improves speech reconstruction, ultimately improving the accuracy and intelligibility of the speech output.<\/span>
\n<\/section>\n
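For readers unfamiliar with the two metrics, here is a minimal sketch of WER and CER as edit distance divided by reference length. It assumes transcripts are already available as strings (the speech recognition step that produces them is omitted) and that the reference is non-empty; the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("the cat sat on the mat", "the cat on the mat"))
print(cer("abc", "abd"))
```

Lower is better for both metrics, so the drop in WER and CER reported above indicates that the inpainted speech is transcribed more accurately than the corrupted speech.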
\n
<\/span>
\n<\/section>\n

\n \n<\/p>\n

\n The figure below shows an example: the upper panel is the corrupted speech and the lower panel is the speech generated by SpeechGen. SpeechGen restores the corrupted speech well.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n
\n<\/section>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n Speech Continuation<\/strong><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/span>
\n<\/section>\n
\n We demonstrate the speech continuation task on LJSpeech. While training the prompt, the model sees only the seed segment of an utterance. The seed segment covers a fraction of the total utterance length, which we call the condition ratio (r), and the model continues by generating the subsequent speech.<\/span>
\n<\/section>\n
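The role of the condition ratio r can be illustrated by splitting a unit sequence into a seed and a continuation target. This is a sketch only: the helper name, rounding the seed length down, and keeping at least one seed unit are assumptions, not details given in the text.

```python
def split_by_condition_ratio(units, r):
    """Split a unit sequence into (seed, continuation); the seed covers a fraction r."""
    if not 0.0 < r < 1.0:
        raise ValueError("condition ratio r must be in (0, 1)")
    seed_len = max(1, int(len(units) * r))  # assumed: round down, keep at least one unit
    return units[:seed_len], units[seed_len:]


units = list(range(10))  # hypothetical unit IDs for one utterance
seed, target = split_by_condition_ratio(units, 0.25)
print(seed, target)  # [0, 1] [2, 3, 4, 5, 6, 7, 8, 9]
```

A smaller r gives the model less context and a longer stretch of speech to generate.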
\n
<\/span>
\n<\/section>\n
\n Below are some examples. Black text represents the seed segment, and red text is the sentence generated by SpeechGen (the text shown here is obtained by running speech recognition on the audio; during training and inference the model performs a purely speech-to-speech task and receives no text at all). Different condition ratios let SpeechGen generate continuations of different lengths that remain coherent and complete the sentence. Qualitatively, the generated sentences are largely grammatically consistent with the seed segment and semantically related to it. The generated speech still cannot perfectly convey a complete meaning, however. We expect this issue to be addressed by more powerful speech models in the future.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n
\n<\/section>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n Limitations and Future Directions<\/strong><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/span>
\n<\/section>\n
\n Speech language models and speech generation are developing rapidly, and our framework offers a way to leverage powerful language models for speech generation. However, the framework still has room for improvement, and many questions merit further study.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 1.<\/span> Compared with text-based language models, speech LMs are still at an early stage of development. Although our prompting framework enables speech LMs to perform speech generation tasks, it does not yet reach outstanding performance. However, as speech LMs advance, for example in the shift from GSLM to Unit mBART, prompting performance has improved noticeably. In particular, tasks that were challenging for GSLM now perform better with Unit mBART. We expect more advanced speech LMs to emerge in the future.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 2.<\/span> Beyond content information: Current speech LMs do not fully capture speaker and emotion information, which makes it hard for the current speech prompting framework to handle such information effectively. One way to overcome this limitation is to introduce plug-and-play modules that inject speaker and emotion information into the framework. Looking ahead, we expect future speech LMs to integrate and exploit this information beyond content, improving performance and better handling the speaker-related and emotion-related aspects of speech generation tasks.<\/span>
\n<\/section>\n
\n
<\/span>
\n<\/section>\n
\n 3.<\/span> Possibilities for prompt generation: Prompt generation offers flexible options and can integrate various types of instructions, including text and image instructions. For instance, we could train a neural network that takes images or text as input, instead of using trained embeddings as prompts as in this paper. The trained network would then serve as a prompt generator, adding more diversity to the framework.<\/span>
\n<\/section>\n

\n \n<\/p>\n

\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n
\n
\n
\n Conclusion<\/strong><\/span>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/span>
\n<\/section>\n
\n In this paper, we explored using prompts to unlock the performance of speech LMs on various generation tasks. We proposed a unified framework named SpeechGen, which has only about 10M trainable parameters. The framework has several desirable properties: it is textless, versatile, efficient, transferable, and affordable. To demonstrate its capabilities, we used Unit mBART as a case study and ran experiments on three different speech generation tasks: speech translation, speech inpainting, and speech continuation.<\/span>
\n<\/section>\n
\n          
<\/span>
\n<\/section>\n
\n When this paper was submitted to arXiv, Google proposed a more advanced speech LM, SPECTRON, which demonstrates that speech LMs can model information such as speaker and emotion. This is exciting news: as more advanced speech LMs are proposed, our unified framework has great potential.<\/span>
\n<\/section>\n

\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n
\n
\n
\n
\n <\/section>\n
\n
\n
\n
\n
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n

\n <\/span>\n <\/p>\n

\n <\/mp-common-profile>
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n
\n <\/section>\n
\n
\n
\n
\n
\n <\/span>
\n <\/section>\n<\/section>\n
\n
\n <\/span>
\n <\/section>\n<\/section>\n<\/section>\n
\n
\n <\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n<\/section>\n

\n <\/mp-style-type>\n<\/p>\n

\n

\n \n <\/p>\n<\/section>\n","protected":false},"excerpt":{"rendered":"

  Paper link: https:\/\/arxiv.org\/pdf\/2306.02207.pdf […]<\/p>\n","protected":false},"author":7,"featured_media":3008,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[90,60],"tags":[204,203],"class_list":["post-2985","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-90","category-60","tag-speechgen","tag-203"],"_links":{"self":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/posts\/2985","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2985"}],"version-history":[{"count":0,"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/posts\/2985\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/media\/3008"}],"wp:attachment":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2985"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2985"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2985"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}