{"id":33908,"date":"2024-09-15T12:05:57","date_gmt":"2024-09-15T04:05:57","guid":{"rendered":"https:\/\/linguaresources.com\/?p=33908"},"modified":"2024-09-15T12:05:57","modified_gmt":"2024-09-15T04:05:57","slug":"sglang%ef%bc%9a%e9%ab%98%e6%95%88%e6%89%a7%e8%a1%8c%e7%bb%93%e6%9e%84%e5%8c%96%e8%af%ad%e8%a8%80%e6%a8%a1%e5%9e%8b%e7%a8%8b%e5%ba%8f","status":"publish","type":"post","link":"https:\/\/linguaresources.com\/?p=33908","title":{"rendered":"SGLang: Efficient Execution of Structured Language Model Programs"},"content":{"rendered":"

 <\/p>\n

Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs\/outputs. However, efficient systems for programming and executing these applications are still lacking. SGLang is a newly introduced system that aims to address this by providing efficient execution of complex language model programs. SGLang consists of a frontend language and a runtime. The frontend language simplifies programming with primitives for generation and parallelism control, while the runtime accelerates execution with novel optimizations such as RadixAttention for KV cache reuse and compressed finite state machines for faster structured output decoding. Experiments show that, compared with state-of-the-art inference systems, SGLang achieves up to 6.4x higher throughput on various large language and multimodal models, on tasks including agent control, logical reasoning, few-shot learning benchmarks, JSON decoding, retrieval-augmented generation pipelines, and multi-turn chat.<\/p>\n

Recent advances in LLM capabilities have broadened their use, enabling them to handle a wider range of general tasks and to act as autonomous agents. In these applications, LLMs engage in multi-round planning, reasoning, and interaction with external environments. This is made possible through tool use, multiple input modalities, and various prompting techniques such as few-shot learning, self-consistency, skeleton-of-thought, and tree-of-thought. These new use cases require multiple, often dependent, LLM generation calls, indicating a trend toward using multi-call structures to complete complex tasks.<\/p>\n

This shift marks a transition from simple chat to more sophisticated programmatic use of LLMs, in which programs schedule and control the generation process of the LLM. Such programs are called “Language Model Programs” (LM programs). Advanced prompting techniques and agentic workflows both fall within the scope of LM programs. LM programs share two properties: (1) they typically involve multiple LLM calls interleaved with control flow, in order to complete complex tasks and improve overall quality; (2) they receive structured inputs and produce structured outputs, which makes it possible to compose LM programs and integrate them into existing software systems.<\/p>\n

In this article, we take a deep dive into the SGLang framework, exploring its architecture, analyzing its performance, and comparing it with state-of-the-art frameworks. Let us get started.<\/p>\n

An Introduction to SGLang<\/strong><\/p>\n

Although LM programs are widely used, current systems for expressing and executing them remain inefficient. SGLang identifies two main challenges to using LM programs efficiently:<\/p>\n

    \n
  1. Programming complexity: Because LLMs are non-deterministic, developing LM programs is tedious and difficult. It involves extensive string manipulation, experimental prompt tuning, brittle output parsing, handling multiple input modalities, and implementing parallelism mechanisms. This complexity significantly reduces the readability of even simple programs.<\/li>\n
  2. Execution inefficiency: Executing LM programs is inefficient because of redundant computation and memory usage. State-of-the-art inference engines, optimized to reduce latency and increase throughput, lack direct knowledge of the workload, which leads to significant inefficiencies. A notable example is reuse of the key-value (KV) cache, which consists of reusable intermediate tensors that are essential for generative inference. Current systems lack effective mechanisms for reusing the KV cache across multiple LLM<\/a> calls, resulting in unnecessary computation and wasted memory. In addition, constrained decoding for structured outputs (such as JSON mode) is suboptimal, because existing systems decode only one token at a time.<\/li>\n<\/ol>\n

    To address these challenges, SGLang introduces a Structured Generation Language for LLMs. Its core idea is to systematically exploit the multi-call structure in LM programs for efficient execution. As shown in the figure below, SGLang has two parts: a frontend language and a backend runtime.<\/p>\n

    The frontend language simplifies the programming of LM programs, while the runtime accelerates their execution. The two parts can work together for better performance or operate independently.<\/p>\n

    SGLang is a domain-specific language embedded in Python that provides primitives for generation (e.g., extend, gen, select) and parallelism control (e.g., fork, join). It is compatible with Python control flow and libraries, so users can develop advanced prompting workflows easily with native Python syntax. SGLang includes an interpreter and a compiler. The interpreter manages the prompt state as a stream and submits primitive operations to the stream for asynchronous execution, ensuring proper control over synchronization and intra-program parallelism. In addition, SGLang programs can be traced and compiled for further optimization. The SGLang runtime proposes several novel optimizations to accelerate the execution of LM programs:<\/p>\n

      \n
    1. RadixAttention: This technique automatically reuses the KV cache across multiple generation calls. In existing inference engines, the KV cache of a request is discarded after processing, which prevents reuse across calls and slows execution. SGLang instead maintains an LRU cache of the KV cache in a radix tree, managing the KV cache like a traditional cache and using the radix tree for efficient matching, insertion, and eviction. This lets the runtime handle various reuse patterns efficiently.<\/li>\n
    2. Compressed finite state machine: This technique enables faster constrained decoding for structured outputs. Existing systems follow the constraints only for the next token, so they can decode just one token at a time. SGLang instead analyzes the constraints and builds a compressed finite state machine to represent them, compressing multi-token paths into single-step paths wherever possible, so that multiple tokens can be decoded at once.<\/li>\n
    3. API speculative execution: For API-only models, such as OpenAI's GPT-4<\/a>, SGLang introduces API speculative execution to optimize multi-call programs.<\/li>\n<\/ol>\n
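The compressed finite state machine idea can be sketched on a toy character-level automaton. This is only an illustration under stated assumptions, not SGLang's implementation: the dictionary-based FSM, the helper names build_fsm and compress_chains, and the 'name=' example constraint are all hypothetical. States with exactly one outgoing transition are chained into a single multi-character edge, which is the property that would let a decoder emit the whole fixed string in one step.<\/p>\n

```python
def build_fsm(prefix, free_chars, suffix):
    # Hypothetical toy FSM: fixed prefix, then any number of free characters, then fixed suffix.
    fsm = {}
    state = 0
    for ch in prefix:
        fsm[state] = {ch: state + 1}   # one character per step
        state += 1
    loop = state
    fsm[loop] = {c: loop for c in free_chars}   # free-form part loops on itself
    fsm[loop][suffix[0]] = loop + 1
    state = loop + 1
    for ch in suffix[1:]:
        fsm[state] = {ch: state + 1}
        state += 1
    fsm[state] = {}   # accepting state, no outgoing edges
    return fsm, state


def compress_chains(fsm, start=0):
    # Merge runs of states that have a single outgoing edge into one multi-character edge.
    compressed = {}
    todo = [start]
    while todo:
        state = todo.pop()
        if state in compressed:
            continue
        compressed[state] = {}
        for ch, nxt in fsm[state].items():
            text, seen = ch, {state}
            while len(fsm[nxt]) == 1 and nxt not in seen:
                seen.add(nxt)   # guard against cycles of singular states
                (ch2, nxt2), = fsm[nxt].items()
                text += ch2
                nxt = nxt2
            compressed[state][text] = nxt
            todo.append(nxt)
    return compressed


fsm, final_state = build_fsm('name=', 'ab', ';')
compressed = compress_chains(fsm)
print(compressed)
```

In this toy case the five single-character steps for 'name=' collapse into one edge, while the free-form part still advances one character at a time because several continuations are allowed there.<\/p>\n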

      Using SGLang, a variety of LLM applications were implemented, including agent control, logical reasoning, few-shot learning benchmarks, JSON decoding, retrieval-augmented generation pipelines, multi-turn chat, and multimodal processing. Performance was tested on models including Llama-7B\/70B, Mixtral-8x7B, LLaVA-v1.5-7B (image), and LLaVA-NeXT-34B (video) on NVIDIA A10G and A100 GPUs. The results show that SGLang achieves up to 6.4x higher throughput across a range of workloads, models, and hardware setups compared with existing programming and inference systems, including Guidance, vLLM, and LMQL.<\/p>\n

      SGLang: Programming Model and Methodology<\/strong><\/p>\n

      The SGLang programming model is introduced through a running example that describes its language primitives and execution modes and outlines opportunities for runtime optimization. The model simplifies tedious operations in multi-call workflows (such as string manipulation, API calls, constraint specification, and parallelism) by providing flexible, composable primitives. SGLang is a domain-specific language embedded in Python. The figure below shows a program that evaluates an essay about an image using the branch-solve-merge prompting method.<\/p>\n

      The function multi_dimensional_judge<\/strong> takes three arguments: `s`<\/strong>, `path`<\/strong>, and `essay`<\/strong>. s manages the prompt state, path is the image file path, and essay is the essay text. New strings and SGLang primitives can be appended to the state s for execution with the +=<\/strong> operator. First, the function adds the image and the essay to the prompt. It then uses select to check whether the essay is related to the image and stores the result in s[“related”]<\/strong>. If it is related, the prompt is forked into three copies that are evaluated in parallel along different dimensions, with gen storing each result in f[“judgment”]<\/strong>. Next, it merges the judgments, generates a summary, and assigns a letter grade. Finally, it returns the results in JSON format, following a schema defined by the regular-expression constraint regex<\/strong>. SGLang greatly simplifies this program: an equivalent program written against an OpenAI API-like interface would need 2.1x as many lines of code because of manual string manipulation and parallelism control.<\/p>\n

      SGLang provides primitives for controlling prompt state, generation, and parallelism, which can be used together with Python syntax and libraries. The primitives are:<\/p>\n

      gen:<\/strong> Calls a model to generate and stores the result in a variable whose name is given by the first argument. It supports a `regex` argument that constrains the output to follow a grammar defined by a regular expression (for example, a JSON schema).<\/p>\n

        \n
      1. select: Calls a model to choose the highest-probability option from a list.<\/li>\n
      2. += or extend: Appends a string to the prompt.<\/li>\n
      3. [variable_name]: Fetches the result of a generation.<\/li>\n
      4. fork: Creates parallel forks of the prompt state.<\/li>\n
      5. join: Rejoins the prompt state.<\/li>\n
      6. image and video: Take in image and video inputs.<\/li>\n<\/ol>\n
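To make the semantics of these primitives concrete, here is a toy, pure-Python emulation. It does not use or reproduce the SGLang API: the ToyState class, the gen and select helpers, the fake_model stub, and the scoring function are all hypothetical stand-ins chosen so the example runs without any model. It only mirrors the behavior described in the list: += extends the prompt, gen and select store a named result, [name] fetches it, and fork copies the state.<\/p>\n

```python
class ToyState:
    # Hypothetical stand-in for a prompt state; not the real SGLang class.
    def __init__(self, model, text='', variables=None):
        self.model = model                      # callable: prompt text -> generated text
        self.text = text                        # prompt accumulated so far
        self.variables = dict(variables or {})  # named generation results

    def __iadd__(self, piece):
        # `s += x` appends a plain string or runs a deferred primitive on the state.
        if callable(piece):
            piece(self)
        else:
            self.text += piece
        return self

    def __getitem__(self, name):
        # `s[name]` fetches a stored generation result.
        return self.variables[name]

    def fork(self, n):
        # Each copy starts from the same prompt prefix and then evolves independently.
        return [ToyState(self.model, self.text, self.variables) for _ in range(n)]


def gen(name):
    # Deferred primitive: generate from the current prompt and store the result under `name`.
    def run(state):
        out = state.model(state.text)
        state.variables[name] = out
        state.text += out
    return run


def select(name, choices, score):
    # Deferred primitive: keep the choice with the highest score for the current prompt.
    def run(state):
        best = max(choices, key=lambda c: score(state.text, c))
        state.variables[name] = best
        state.text += best
    return run


def fake_model(prompt):
    # Deterministic stub so the example runs without an LLM.
    return '[' + str(len(prompt)) + ' chars seen]'


s = ToyState(fake_model)
s += 'Essay: the cat sat. Is it related to the image? '
s += select('related', ['yes', 'no'], score=lambda prompt, c: len(c))
forks = s.fork(3)
for f, dim in zip(forks, ['clarity', 'originality', 'evidence']):
    f += 'Judge the ' + dim + ': '
    f += gen('judgment')
# Joining here simply means reading each fork's result back into the parent prompt.
s += 'Summary of judgments: ' + ' | '.join(f['judgment'] for f in forks)
print(s['related'], len(forks))
```

The design choice worth noting is that primitives are deferred callables applied by +=, which keeps prompt construction and generation in one linear piece of Python code.<\/p>\n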

        The simplest way to execute an SGLang program is through the interpreter, which treats a prompt as an asynchronous stream. Primitives such as extend, gen, and select<\/strong> are submitted to the stream for asynchronous execution. These non-blocking calls let Python code continue running without waiting for generation to finish, much like launching CUDA kernels asynchronously. Each prompt is managed by a stream executor in a background thread, which enables intra-program parallelism. Fetching a generation result blocks until the result is ready, ensuring correct synchronization. Alternatively, SGLang programs can be compiled into computational graphs and run with a graph executor, which allows further optimizations. The SGLang paper uses interpreter mode by default and discusses compiler-mode results in its Appendix D. SGLang supports open-weight models through its own SGLang Runtime (SRT), as well as API models such as OpenAI<\/a> and Anthropic models.<\/p>\n
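The non-blocking submit and blocking fetch pattern described above can be sketched with the Python standard library. This is an assumption-laden illustration rather than the SGLang interpreter: the Stream class, the submit_gen method, and the slow_generate stub are hypothetical names, and a single-worker thread pool stands in for the stream executor of each prompt.<\/p>\n

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_generate(prompt):
    # Hypothetical stand-in for a model call; sleeps to mimic generation latency.
    time.sleep(0.2)
    return prompt.upper()


class Stream:
    # One background worker per prompt keeps the operations of that prompt in order.
    def __init__(self):
        self.worker = ThreadPoolExecutor(max_workers=1)
        self.results = {}

    def submit_gen(self, name, prompt):
        # Non-blocking: returns immediately while generation runs in the background.
        self.results[name] = self.worker.submit(slow_generate, prompt)

    def __getitem__(self, name):
        # Blocking: waits until the named result is ready.
        return self.results[name].result()


start = time.perf_counter()
streams = [Stream() for _ in range(3)]
for i, st in enumerate(streams):
    st.submit_gen('answer', 'branch ' + str(i))
submitted = time.perf_counter() - start      # nothing has been awaited yet
answers = [st['answer'] for st in streams]   # blocks until each result is ready
elapsed = time.perf_counter() - start
for st in streams:
    st.worker.shutdown()
print(answers)
```

Because the three streams run in separate background threads, the total wait should be close to one generation rather than three, which is the intra-program parallelism the paragraph refers to.<\/p>\n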

        Programming systems for LLMs can be classified as high-level (e.g., LangChain, DSPy) or low-level (e.g., LMQL, Guidance, SGLang). High-level systems provide predefined or auto-generated prompts, such as the prompt optimizer in DSPy. Low-level systems typically do not alter prompts but allow direct manipulation of prompts and primitives. SGLang is a low-level system similar to LMQL and Guidance. The table below compares their features.<\/p>\n

        SGLang focuses more on runtime efficiency and comes with its own co-designed runtime, which allows for novel optimizations. High-level languages (e.g., DSPy) can be compiled to low-level languages (e.g., SGLang). The integration of SGLang as a backend in DSPy for better runtime efficiency is demonstrated later.<\/p>\n

        The example above illustrates RadixAttention operating with an LRU eviction policy across nine time points, showing how the radix tree evolves dynamically in response to various requests. The requests include two chat sessions, a batch of few-shot learning queries, and self-consistency sampling. Each tree edge carries a label denoting a substring or a sequence of tokens. Nodes are color-coded to reflect their state: green for newly added nodes, blue for cached nodes accessed during the time point, and red for nodes that have been evicted.<\/p>\n

        Step 1:<\/strong> The radix tree is initially empty.<\/p>\n

        Step 2:<\/strong> The server processes an incoming user message “Hello!” and responds with the LLM output “Hi!”. The system prompt “You are a helpful assistant”, the user message “Hello!”, and the LLM reply “Hi!” are consolidated into the tree as a single edge linked to a new node.<\/p>\n

        Step 3:<\/strong> A new prompt arrives, and the server finds its prefix (the first turn of the conversation) in the radix tree and reuses its KV cache. The new turn is appended to the tree as a new node.<\/p>\n

        Step 4:<\/strong> A new chat session begins. The node from Step 3 is split into two nodes so that the two chat sessions can share the system prompt.<\/p>\n

        Step 5:<\/strong> The second chat session continues. However, because of memory limits, one node from Step 4 must be evicted. The new turn is appended after the remaining node from Step 4.<\/p>\n

        Step 6:<\/strong> The server receives a few-shot learning query, processes it, and inserts it into the tree. The root node is split because the new query shares no prefix with existing nodes.<\/p>\n

        Step 7:<\/strong> The server receives a batch of additional few-shot learning queries. These queries share the same set of few-shot examples, so a node from Step 6 is split to enable sharing.<\/p>\n

        Step 8:<\/strong> The server receives a new message from the first chat session. It evicts all nodes from the second chat session because they are the least recently used.<\/p>\n

        Step 9:<\/strong> The server receives a request to sample more answers for the questions in a node from Step 8, likely for self-consistency prompting. To make room for these requests, multiple nodes are evicted.<\/p>\n

        This example demonstrates how RadixAttention dynamically allocates and evicts nodes in response to different types of requests, ensuring efficient KV cache reuse and memory management.<\/p>\n
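The matching, insertion, and LRU eviction steps walked through above can be sketched in a few lines. This is a simplified illustration and not the SGLang implementation: it uses a plain token-level trie instead of a true radix tree with edge splitting, it counts cached tokens instead of storing KV tensors, and the names Node, PrefixCache, match_prefix, insert, and evict are hypothetical.<\/p>\n

```python
import itertools


class Node:
    def __init__(self, parent=None, token=None):
        self.parent = parent
        self.token = token
        self.children = {}
        self.last_used = 0


class PrefixCache:
    # Token-level trie with LRU eviction of leaves; each node stands for one cached token.
    def __init__(self, capacity):
        self.root = Node()
        self.capacity = capacity          # maximum number of cached tokens
        self.size = 0
        self.clock = itertools.count(1)   # logical time for LRU ordering

    def match_prefix(self, tokens):
        # Return how many leading tokens are already cached, refreshing their timestamps.
        now = next(self.clock)
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            child.last_used = now
            node, matched = child, matched + 1
        return matched

    def insert(self, tokens):
        # Cache a full sequence and return the number of tokens reused from the cache.
        reused = self.match_prefix(tokens)
        now = next(self.clock)
        node = self.root
        for tok in tokens:
            if tok not in node.children:
                node.children[tok] = Node(node, tok)
                self.size += 1
            node = node.children[tok]
            node.last_used = now
        self.evict()
        return reused

    def evict(self):
        # Drop least recently used leaves until the cache fits its capacity.
        while self.size > self.capacity:
            leaves = [n for n in self._nodes() if not n.children]
            victim = min(leaves, key=lambda n: n.last_used)
            del victim.parent.children[victim.token]
            self.size -= 1

    def _nodes(self):
        stack = list(self.root.children.values())
        while stack:
            n = stack.pop()
            yield n
            stack.extend(n.children.values())


cache = PrefixCache(capacity=8)
system = ['You', 'are', 'helpful']
a = cache.insert(system + ['Hello'])          # first chat session, nothing to reuse
b = cache.insert(system + ['Hello', 'More'])  # next turn reuses the whole first turn
c = cache.insert(system + ['Bye'])            # second session shares the system prompt
d = cache.insert(['Q1', 'A1', 'Q2', 'A2'])    # unrelated query forces LRU eviction
print(a, b, c, d, cache.size)
```

Only leaves are evicted, so a shared prefix such as the system prompt stays cached as long as any sequence below it is still in use.<\/p>\n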

        SGLang: Evaluation and Results<\/strong><\/p>\n

        Results on Open-Weight Models<\/strong><\/p>\n

        The latency and throughput results are shown in the figures below. SGLang improves throughput by up to 6.4x and reduces latency by up to 3.7x. These improvements come from KV cache reuse, exploitation of parallelism within a single program, and faster constrained decoding.<\/p>\n

        On these benchmarks, the cache hit rate ranges from 50% to 99%. Figure 13 (Appendix) lists the achieved and optimal cache hit rates for all benchmarks, showing that the cache-aware scheduling in SGLang reaches 96% of the optimal hit rate on average.<\/p>\n

        Results on Larger Models with Tensor Parallelism<\/strong><\/p>\n

        The larger models Mixtral-8x7B and Llama-70B were tested with tensor parallelism on the same set of benchmarks, with results shown in the figure below. The speedup on larger models follows a trend similar to that on smaller models, indicating that the optimizations in SGLang generalize well to larger models. Guidance and LMQL were omitted because they lack efficient tensor-parallel implementations.<\/p>\n

        Results on Multimodal Models<\/strong><\/p>\n

        SGLang natively supports multimodal models through its image and video primitives. The optimizations described here are compatible with multimodal models. For RadixAttention, the hash of the input image is computed and used as the key in the radix tree, which allows the KV cache of image tokens from the same image to be reused. LLaVA-v1.5-7B (image) was run on llava-bench-in-the-wild, and LLaVA-NeXT-34B (video) on ActivityNet. Because these models are not well supported by other baseline systems, the original implementation by the model authors in Hugging Face Transformers was used as the baseline. As shown in the table below, SGLang achieves up to 6x higher throughput on these benchmarks. In llava-bench-in-the-wild, multiple questions about the same image are processed, and the SGLang runtime reuses the KV cache in this case.<\/p>\n
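The image-hash idea can be illustrated with a short sketch. The source only says that a hash of the input image is used as the key; the choice of SHA-256, the 'img:' prefix, and the function names image_key and cache_key below are assumptions made for illustration.<\/p>\n

```python
import hashlib


def image_key(image_bytes):
    # Identical image bytes always map to the same key (SHA-256 is an assumed choice).
    return 'img:' + hashlib.sha256(image_bytes).hexdigest()


def cache_key(image_bytes, question_tokens):
    # The image key comes first, so all questions about one image share a cacheable prefix.
    return [image_key(image_bytes)] + list(question_tokens)


k1 = cache_key(b'same pixels', ['what', 'color'])
k2 = cache_key(b'same pixels', ['how', 'many'])
k3 = cache_key(b'other pixels', ['what', 'color'])
print(k1[0] == k2[0], k1[0] == k3[0])
```

With keys built this way, two questions about the same image start with the same element, so a prefix cache can match on the image part before the questions diverge.<\/p>\n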

        Production Deployment<\/strong><\/p>\n

        SGLang has been deployed in Chatbot Arena to serve open-weight models. Because traffic for some models is low, each model is served by only one SGLang worker. After one month, the RadixAttention cache hit rate was 52.4% for LLaVA-Next-34B and 74.1% for Vicuna-33B. The cache hits came from common system messages, frequently reused example images, and multi-turn chat histories. This reduced first-token latency for Vicuna-33B by 1.7x on average.<\/p>\n

        Link to the original article<\/a><\/p>\n

        (Machine translation with light post-editing; for reference only.)<\/p>\n

        Editor: Hu Yue<\/p>\n

         <\/p>\n

         <\/p>\n","protected":false},"excerpt":{"rendered":"

          Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs\/outputs […]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[391],"tags":[],"class_list":["post-33908","post","type-post","status-publish","format-standard","hentry","category-391"],"_links":{"self":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/posts\/33908","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=33908"}],"version-history":[{"count":1,"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/posts\/33908\/revisions"}],"predecessor-version":[{"id":33909,"href":"https:\/\/linguaresources.com\/index.php?rest_route=\/wp\/v2\/posts\/33908\/revisions\/33909"}],"wp:attachment":[{"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=33908"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=33908"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/linguaresources.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=33908"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}