Llama 3.1-405B, developed by Meta AI, represents a major leap forward for open-source language models. With 405 billion parameters, it is the largest publicly available language model to date, and on a range of benchmarks it rivals, and in some cases surpasses, some of the most advanced proprietary models.

Releasing such a capable model as open source is a game changer: it democratizes access to state-of-the-art AI capabilities and fosters innovation across the industry.

The process begins by converting input text tokens into token embeddings. These embeddings pass through multiple layers of self-attention and feed-forward networks, which lets the model capture complex relationships and dependencies in the text. An autoregressive decoding mechanism then generates the output text tokens, completing the process.

Grouped-query attention

Llama 3.1 uses grouped-query attention, an important optimization technique. Let's look at it in detail:
Grouped-query attention (GQA) is a variant of multi-head attention designed to reduce computational cost and memory usage during inference, especially for long sequences. In the Llama 3.1 405B model, GQA is implemented with 8 key-value heads.

Here is how GQA works:

Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V

where Q is divided into g groups, and K and V have fewer heads than Q.

The benefits of GQA in Llama 3.1 405B include:

Llama 3.1 405B reaches its 128K-token context window through a two-stage pre-training process, which is an important aspect of the model's capabilities:

Stage 1: Initial pre-training on 8K tokens

Stage 2: Continued pre-training for context extension

Llama 3.1 405B implements its multimodal capabilities as follows:

Compositional approach:

Integration with the language model:

Cross-attention mechanism:

The multimodal capabilities of Llama 3.1 405B open up a wide range of applications, such as:

The instruction-tuned version underwent additional training:

The table below compares Llama 3.1 405B, Nemotron 4 340B Instruct, GPT-4 (0125), GPT-4 Omni, and Claude 3.5 Sonnet. Key benchmarks include general tasks such as MMLU and IFEval, coding and math tasks such as HumanEval and GSM8K, and reasoning tasks such as ARC Challenge. Each benchmark score reflects the model's ability to understand and generate human-like text, solve complex problems, and work with code. Notably, Llama 3.1 405B and Claude 3.5 Sonnet perform strongly on several benchmarks, demonstrating advanced capabilities in both general and domain-specific tasks.

Running Llama 3.1-405B requires substantial memory and compute resources:

Running a 405B-parameter model like Llama 3.1 efficiently requires several optimization techniques. The key methods for efficient inference are:

b) Tensor parallelism: Tensor parallelism splits model layers across multiple GPUs so that their computations run in parallel. This is especially useful for large models like Llama 3.1, because it makes efficient use of the available hardware.

Deploying Llama 3.1-405B requires careful consideration of hardware resources. Here are some options:

b) On-premises deployment: For organizations with high-performance computing capabilities, deploying Llama 3.1 on-premises offers more control and potentially lower long-term costs.
Example setup:

The power and flexibility of Llama 3.1-405B open up numerous possibilities:

b) Knowledge distillation: Transfer the knowledge of the 405B model into smaller, more easily deployed models.

These techniques and strategies will help you realize the full potential of Llama 3.1-405B and build efficient, scalable, and specialized AI applications.

The release of Llama 3.1-405B is likely to accelerate innovation in several areas:

Key features

Model architecture and training

1. Grouped-query attention (GQA)

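To make the grouped-query attention formula concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not Meta's implementation: the function names (`attend`, `grouped_query_attention`), the nested-list tensor layout, the toy sizes, and the causal mask are all choices made for this example. The only detail taken from the article is the idea that several query heads share one key-value head.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """softmax(q . K^T / sqrt(d_k)) V for a single query vector."""
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k) for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


def grouped_query_attention(Q, K, V):
    """Q is [n_q_heads][seq][d]; K and V are [n_kv_heads][seq][d]."""
    n_q_heads, n_kv_heads = len(Q), len(K)
    assert n_q_heads % n_kv_heads == 0, "query heads must split evenly into groups"
    group_size = n_q_heads // n_kv_heads
    output = []
    for head, q_head in enumerate(Q):
        kv = head // group_size  # every query head in a group reads the same K/V head
        # Causal masking (an assumption of this sketch): position t sees 0..t only.
        output.append(
            [attend(q, K[kv][: t + 1], V[kv][: t + 1]) for t, q in enumerate(q_head)]
        )
    return output


# Toy sizes: 4 query heads share 2 key-value heads (Llama 3.1 405B uses 8 KV heads).
seq, d = 3, 2
Q = [[[0.1 * (h + t + i) for i in range(d)] for t in range(seq)] for h in range(4)]
K = [[[0.2 * (g + t - i) for i in range(d)] for t in range(seq)] for g in range(2)]
V = [[[float(10 * g + t + i) for i in range(d)] for t in range(seq)] for g in range(2)]

out = grouped_query_attention(Q, K, V)
print(len(out), len(out[0]), len(out[0][0]))  # prints: 4 3 2
```

Because only the key and value tensors are cached during generation, reducing the number of key-value heads shrinks the cache while the number of query heads stays the same.
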
2. Two-stage pre-training for extended context

3. Multimodal capabilities

Training details

Performance benchmarks

Memory requirements for Llama 3.1-405B

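A rough back-of-the-envelope calculation helps show why the memory demands are so large. The sketch below estimates the memory needed for the weights alone at several precisions, then the key-value cache for a single 128K-token sequence. The parameter count, the 8 key-value heads, and the 128K context come from the article; the layer count (126), head dimension (128), and query-head count (128) are assumptions drawn from Meta's published configuration rather than from this text, and the helper names are hypothetical. Activations, framework overhead, and batching are ignored.

```python
PARAMS = 405e9  # 405 billion parameters, as stated in the article

# Bytes per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params, bytes_per_param):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def kv_cache_gib(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """KV-cache size in GiB for one sequence; the leading 2 covers keys and values."""
    total = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_tokens
    return total / 2**30


for fmt, size in BYTES_PER_PARAM.items():
    print(f"{fmt}: {weight_memory_gb(PARAMS, size):.1f} GB")

# Assumed hyperparameters (not from this article): 126 layers, head dimension 128,
# 128 query heads. The 8 KV heads and the 128K context come from the text.
context = 128 * 1024
gqa = kv_cache_gib(context, n_layers=126, n_kv_heads=8, head_dim=128)
mha = kv_cache_gib(context, n_layers=126, n_kv_heads=128, head_dim=128)
print(f"KV cache at 128K tokens: GQA {gqa:.0f} GiB vs full multi-head {mha:.0f} GiB")
```

Under these assumptions, 16-bit weights alone come to about 810 GB, and a 16-bit cache for one full-length sequence is about 63 GiB with 8 key-value heads, compared with about 1008 GiB if every query head kept its own keys and values.
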
Inference optimization techniques for Llama 3.1-405B

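The tensor parallelism idea can be sketched without any GPU at all. The example below simulates one common scheme, column-parallel splitting of a linear layer's weight matrix, using plain Python lists: each simulated device multiplies the input by its own block of columns, and the slices are then concatenated. The article does not say which splitting scheme is used for Llama 3.1, so treat the scheme and the function names (`split_columns`, `column_parallel_linear`) as assumptions made for illustration.

```python
def matmul(x, w):
    """Multiply x ([m][k]) by w ([k][n]) using plain Python lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, n_shards):
    """Cut the weight matrix into n_shards equal column blocks, one per device."""
    n_cols = len(w[0])
    assert n_cols % n_shards == 0, "columns must divide evenly across shards"
    step = n_cols // n_shards
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(n_shards)]


def column_parallel_linear(x, w, n_shards):
    """Compute x @ w by giving each simulated device one column block of w."""
    partials = [matmul(x, shard) for shard in split_columns(w, n_shards)]
    # "All-gather" step: stitch each row back together from the per-device slices.
    return [[v for part in partials for v in part[r]] for r in range(len(x))]


x = [[1, 2, 3], [4, 5, 6]]                      # 2 tokens, hidden size 3
w = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 1, 0, 1]]  # 3 x 4 weight matrix
sharded = column_parallel_linear(x, w, n_shards=2)
print(sharded == matmul(x, w))  # prints: True
```

Each device only ever holds its own slice of the weights, which is what allows a model too large for one GPU to be served across several; the cost is the communication step that gathers the partial outputs.
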
Deployment strategies

Use cases and applications

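For the knowledge distillation use case, a small sketch may help show what "transferring knowledge" means in practice. The code below uses the classic soft-target formulation: the student is trained to match the teacher's temperature-softened output distribution, measured with KL divergence. The article does not specify a distillation recipe for Llama 3.1-405B, so the loss, the temperature value, and the example logits are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperatures flatten the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature^2."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0)
    return kl * temperature ** 2


# Hypothetical next-token logits over a three-word vocabulary.
teacher_logits = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

close = distillation_loss(close_student, teacher_logits)
far = distillation_loss(far_student, teacher_logits)
print(close < far)  # prints: True
```

A student whose predictions already resemble the teacher's gets a small loss, while one that disagrees gets a large loss; in training, this term is typically combined with the ordinary next-token loss on the ground-truth data.
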
Future directions