Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs and outputs. However, efficient systems for programming and executing these applications are still lacking. SGLang is a newly introduced system that addresses this by providing efficient execution of complex language model programs. SGLang consists of a frontend language and a runtime. The frontend simplifies programming with primitives for generation and parallelism control, while the runtime accelerates execution with novel optimizations such as RadixAttention for KV cache reuse and compressed finite state machines for faster structured output decoding. Experiments show that SGLang achieves up to 6.4× higher throughput than state-of-the-art inference systems on various large language and multimodal models, across tasks including agent control, logical reasoning, few-shot learning benchmarks, JSON decoding, retrieval-augmented generation pipelines, and multi-turn chat.

Recent advances in LLM capabilities have broadened their use, enabling them to handle a wider range of general tasks and to act as autonomous agents. In these applications, LLMs engage in multi-round planning, reasoning, and interaction with external environments. This is made possible by tool use, multiple input modalities, and prompting techniques such as few-shot learning, self-consistency, skeleton-of-thought, and tree-of-thought. These new use cases require multiple, often dependent, LLM generation calls, which points to a trend of using multi-call structures to complete complex tasks.

This shift marks a transition from simple chat to more sophisticated programmatic use of LLMs, in which programs schedule and control the LLM's generation process. Such programs are called "Language Model Programs" (LM programs). Advanced prompting techniques and agentic workflows both fall within the scope of LM programs. LM programs share two properties: (1) they typically involve multiple LLM calls interleaved with control flow, in order to complete complex tasks and improve overall quality; and (2) they take structured inputs and produce structured outputs, which makes it possible to compose LM programs and integrate them with existing software systems.

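To make the two properties concrete, here is a minimal sketch of an LM program in plain Python. It is not SGLang code: `stub_llm` is a hypothetical stand-in that returns canned text instead of calling a model, and the `judge` function and its JSON fields are invented for illustration. It only shows the shape of the idea: two dependent calls separated by control flow, with a structured result.

```python
import json


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; returns canned text."""
    if prompt.startswith("Is this essay relevant"):
        return "yes"
    return "The essay is clear and well organized."


def judge(essay: str) -> str:
    # Call 1: a yes/no decision that steers the control flow.
    related = stub_llm(f"Is this essay relevant to the image? {essay}")
    if related != "yes":
        return json.dumps({"related": False, "summary": None})
    # Call 2: only issued if call 1 said "yes" (a dependent call).
    summary = stub_llm(f"Summarize the quality of this essay: {essay}")
    # Structured output that other software can parse.
    return json.dumps({"related": True, "summary": summary})


result = json.loads(judge("A short essay about a sunset photo."))
```

Because the output is JSON, `result` can be passed directly to downstream code, which is the kind of composition the second property refers to.
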
In this article, we take a deeper look at the SGLang framework: we explore its architecture, analyze its performance, and compare it with state-of-the-art frameworks. Let's get started.

An Introduction to SGLang

Although LM programs are widely used, current systems for expressing and executing them remain inefficient. SGLang identifies two main challenges in using LM programs efficiently: the difficulty of programming them and the inefficiency of executing them.

To address these challenges, SGLang introduces a Structured Generation Language for LLMs. The core idea is to systematically exploit the multi-call structure of LM programs for efficient execution. As shown in the figure below, SGLang has two parts: a frontend language and a backend runtime.

The frontend simplifies the programming of LM programs, and the runtime accelerates their execution. The two parts can work together for better performance or function independently.

SGLang is a domain-specific language embedded in Python. It provides primitives for generation (e.g., `extend`, `gen`, `select`) and parallelism control (e.g., `fork`, `join`). It is compatible with Python's control flow and libraries, so users can develop advanced prompting workflows easily with native Python syntax. SGLang includes an interpreter and a compiler.
The interpreter manages the prompt state as a stream and submits primitive operations to the stream for asynchronous execution, ensuring proper control over synchronization and intra-program parallelism. In addition, SGLang programs can be traced and compiled for further optimization. The SGLang runtime proposes several novel optimizations to accelerate LM program execution, including RadixAttention for KV cache reuse and compressed finite state machines for faster structured output decoding.

Various LLM applications have been implemented with SGLang, including agent control, logical reasoning, few-shot learning benchmarks, JSON decoding, retrieval-augmented generation pipelines, multi-turn chat, and multimodal processing. Performance was tested on NVIDIA A10G and A100 GPUs with models including Llama-7B/70B, Mixtral-8x7B, LLaVA-v1.5-7B (image), and LLaVA-NeXT-34B (video). The results show that SGLang achieves up to 6.4× higher throughput than existing programming and inference systems, including Guidance, vLLM, and LMQL, across a range of workloads, models, and hardware setups.

SGLang: Programming Model and Methodology

The SGLang programming model is introduced through a running example that describes its language primitives and execution modes and outlines opportunities for runtime optimization.
The model simplifies tedious operations in multi-call workflows (e.g., string manipulation, API calls, constraint specification, parallelism) by providing flexible, composable primitives. SGLang is a domain-specific language embedded in Python. The figure below shows a program that evaluates an essay about an image using the branch-solve-merge prompting method.

The function `multi_dimensional_judge` takes three arguments: `s`, `path`, and `essay`. `s` manages the prompt state, `path` is the image file path, and `essay` is the essay text. New strings and SGLang primitives can be appended to the state `s` for execution with the `+=` operator. First, the function adds the image and the essay to the prompt. It then uses `select` to check whether the essay is related to the image and stores the result in `s["related"]`. If it is, the prompt is forked into three copies for parallel evaluation along different dimensions, and `gen` stores each result in `f["judgment"]`. Next, it merges the judgments, generates a summary, and assigns a letter grade. Finally, it returns the result in JSON format, following the schema defined by the regular expression constraint `regex`. SGLang greatly simplifies this program: an equivalent program written against an OpenAI API-like interface would take 2.1× as many lines of code because of manual string manipulation and parallelism control.

SGLang provides primitives for controlling prompt state, generation, and parallelism, which can be used together with Python syntax and libraries. These primitives include:

`gen`: calls a model to generate and stores the result in a variable whose name is given as the first argument. It supports a `regex` argument that constrains the output to follow a grammar defined by a regular expression (e.g., a JSON schema).

The simplest way to execute an SGLang program is through the interpreter, where a prompt is treated as an asynchronous stream. Primitives such as `extend`, `gen`, and `select` are submitted to the stream for asynchronous execution. These non-blocking calls let Python code continue running without waiting for generation to finish, much like launching CUDA kernels asynchronously. Each prompt is managed by a stream executor in a background thread, which enables intra-program parallelism.

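The asynchronous-stream idea can be sketched with the Python standard library alone. The class below is a toy illustration, not SGLang's interpreter: `PromptStream`, its method names, and `slow_model` are all hypothetical, and the "model" is a function that sleeps and returns a fixed string. It assumes one background worker per prompt, with primitives queued in order and a read that waits until the value has been generated.

```python
import queue
import threading
import time


class PromptStream:
    """Toy prompt stream: primitives are queued and run by a background thread."""

    def __init__(self, model):
        self._model = model      # callable: prompt text -> generated text
        self._text = ""          # prompt state, touched only by the worker
        self._values = {}        # variable name -> generated text
        self._ready = {}         # variable name -> Event set once generated
        self._ops = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        # Run queued primitives strictly in submission order.
        while True:
            kind, arg = self._ops.get()
            if kind == "extend":
                self._text += arg
            else:  # "gen"
                out = self._model(self._text)
                self._text += out
                self._values[arg] = out
                self._ready[arg].set()

    def extend(self, text):
        self._ops.put(("extend", text))      # returns immediately

    def gen(self, name):
        self._ready[name] = threading.Event()
        self._ops.put(("gen", name))         # returns immediately

    def __getitem__(self, name):
        self._ready[name].wait()             # block until the value exists
        return self._values[name]


def slow_model(prompt):
    """Hypothetical stand-in for an LLM call."""
    time.sleep(0.2)
    return "4"


s = PromptStream(slow_model)
s.extend("Q: What is 2 + 2?\nA: ")
start = time.perf_counter()
s.gen("answer")                              # submitted, not awaited
submit_time = time.perf_counter() - start
answer = s["answer"]                         # waits for the worker here
total_time = time.perf_counter() - start
```

In this sketch, `submit_time` stays small because `gen` only enqueues work, while reading `s["answer"]` is where the caller actually waits. Running several such streams side by side would give the intra-program parallelism described above.
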
Fetching a generation result blocks until the result is ready, which ensures correct synchronization. Alternatively, SGLang programs can be compiled into computational graphs and executed with a graph executor, allowing for more optimizations. The SGLang paper uses interpreter mode by default and discusses compiler-mode results in its Appendix D. SGLang supports open-weight models through its own SGLang Runtime (SRT), as well as API models such as OpenAI and Anthropic models.

Programming systems for LLMs can be classified as high-level (e.g., LangChain, DSPy) or low-level (e.g., LMQL, Guidance, SGLang). High-level systems provide predefined or auto-generated prompts, such as DSPy's prompt optimizer. Low-level systems typically do not alter prompts but allow direct manipulation of prompts and primitives. SGLang is a low-level system similar to LMQL and Guidance. The table below compares their features.

SGLang focuses more on runtime efficiency and comes with its own co-designed runtime, which enables novel optimizations. High-level languages (e.g., DSPy) can be compiled to low-level languages (e.g., SGLang). The integration of SGLang as a backend in DSPy for better runtime efficiency is demonstrated later.

The example above illustrates RadixAttention operating with an LRU eviction policy across nine time points, showing how the radix tree evolves dynamically in response to various requests. The requests include two chat sessions, a batch of few-shot learning queries, and self-consistency sampling. Each tree edge carries a label denoting a substring or a sequence of tokens. Nodes are color-coded to reflect their state: green for newly added nodes, blue for cached nodes accessed during the time point, and red for nodes that have been evicted.

Step 1: The radix tree is initially empty.

Step 2: The server processes an incoming user message "Hello!" and responds with the LLM output "Hi!". The system prompt "You are a helpful assistant", the user message "Hello!", and the LLM reply "Hi!" are consolidated into the tree as a single edge linked to a new node.

Step 3: A new prompt arrives. The server finds the prefix of the prompt (the first turn of the conversation) in the radix tree and reuses its KV cache. The new turn is appended to the tree as a new node.

Step 4: A new chat session begins. The node from step 3 is split into two nodes so that the two chat sessions can share the system prompt.

Step 5: The second chat session continues. However, because of the memory limit, a node from step 4 must be evicted. The new turn is appended after the remaining node from step 4.

Step 6: The server receives a few-shot learning query, processes it, and inserts it into the tree. The root node is split because the new query shares no prefix with existing nodes.

Step 7: The server receives a batch of additional few-shot learning queries. These queries share the same set of few-shot examples, so a node from step 6 is split to enable sharing.

Step 8: The server receives a new message from the first chat session. It evicts all nodes from the second chat session because they are the least recently used.

Step 9: The server receives a request to sample more answers for the questions in a node from step 8, likely for self-consistency prompting. To make space for these requests, multiple nodes are evicted.

This example demonstrates how RadixAttention dynamically allocates and evicts nodes in response to different types of requests, ensuring efficient KV cache reuse and memory management.

SGLang: Evaluation and Results

Results on Open-Weight Models

The latency and throughput results are shown in the figure below. SGLang improves throughput by up to 6.4× and reduces latency by up to 3.7×. These improvements come from KV cache reuse, exploiting parallelism within a single program, and faster constrained decoding.

On these benchmarks, the cache hit rate ranges from 50% to 99%. Figure 13 (Appendix) lists the achieved and optimal cache hit rates for all of them, showing that SGLang's cache-aware scheduling reaches, on average, 96% of the optimal hit rate.

Results on Larger Models with Tensor Parallelism

The larger models Mixtral-8x7B and Llama-70B were tested with tensor parallelism on the same set of benchmarks, and the results are shown in the figure below. The speedup on larger models follows a trend similar to that on smaller models, indicating that SGLang's optimizations generalize well to larger models. Guidance and LMQL are omitted because they lack efficient tensor-parallel implementations.

Results on Multimodal Models

SGLang natively supports multimodal models through image and video primitives. The optimizations described here are compatible with multimodal models. For RadixAttention, a hash of the input image is computed and used as the key in the radix tree, which allows the KV cache of the image tokens to be reused for the same image. LLaVA-v1.5-7B (image) was run on llava-bench-in-the-wild, and LLaVA-NeXT-34B (video) on ActivityNet. Because these models are not well supported by other baseline systems, the model authors' original implementation in Hugging Face Transformers was used as the baseline. As shown in the table below, SGLang achieves up to 6× higher throughput on these benchmarks. In llava-bench-in-the-wild, multiple questions about the same image are handled, and the SGLang runtime reuses the KV cache in this case.

Production Deployment

SGLang has been deployed in Chatbot Arena to serve open-weight models. Because traffic for some models is low, each model is served by a single SGLang worker. After one month, the RadixAttention cache hit rate was 52.4% for LLaVA-Next-34B and 74.1% for Vicuna-33B. Cache hits came from common system messages, frequently reused example images, and multi-turn chat histories. This reduced first-token latency by an average of 1.7× for Vicuna-33B.
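
The hit rates above come from matching each new request against prefixes that are already cached. As a rough illustration of that mechanism, the sketch below uses a plain token-level trie with LRU eviction of leaves. It is a simplification under stated assumptions, not SGLang's implementation: a real radix tree compresses runs of tokens into single edges and splits them on demand, whereas here each node holds one token, and the names `PrefixCache`, `Node`, and `process` are hypothetical.

```python
import itertools


class Node:
    def __init__(self, parent=None, token=None):
        self.parent = parent
        self.token = token
        self.children = {}     # token -> Node
        self.last_used = 0     # logical timestamp for LRU


class PrefixCache:
    """Token-level trie with LRU eviction of leaves (one node = one cached token)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.size = 0
        self.root = Node()
        self._clock = itertools.count(1)

    def _leaves(self):
        stack = [self.root]
        while stack:
            n = stack.pop()
            if n is not self.root and not n.children:
                yield n
            stack.extend(n.children.values())

    def _evict_one(self, now):
        # Nodes touched by the current request (last_used == now) are protected.
        candidates = [n for n in self._leaves() if n.last_used < now]
        if not candidates:
            return False
        victim = min(candidates, key=lambda n: n.last_used)
        del victim.parent.children[victim.token]
        self.size -= 1
        return True

    def process(self, tokens):
        """Serve one request; return how many leading tokens were cache hits."""
        now = next(self._clock)
        node, hits = self.root, 0
        # Walk the longest cached prefix and refresh its timestamps.
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            child.last_used = now
            node, hits = child, hits + 1
        # Insert the uncached suffix, evicting LRU leaves when full.
        for tok in tokens[hits:]:
            if self.size >= self.capacity and not self._evict_one(now):
                break              # nothing evictable; leave the rest uncached
            child = Node(node, tok)
            child.last_used = now
            node.children[tok] = child
            node = child
            self.size += 1
        return hits


system = ["sys1", "sys2"]
requests = [system + ["hello"], system + ["hello", "hi", "how"], system + ["bye"]]
cache = PrefixCache(capacity=8)
hits = [cache.process(r) for r in requests]
hit_rate = sum(hits) / sum(len(r) for r in requests)
```

Here the second request reuses the whole first turn and the third reuses only the shared system prompt, which mirrors why shared system messages and multi-turn histories drive the hit rate in deployment.
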