Let’s Build the GPT Tokenizer: A Complete Guide to Tokenization in LLMs

Eighteen months ago, Andrej Karpathy set a challenge: “Can you take my 2h13m tokenizer video and translate the video into the format of a book chapter”. We’ve done it, and the chapter is below, with key pieces of code inlined and images from the video at key points (each hyperlinked to the video timestamp). The video is a great way to learn this key piece of how LLMs work, and this new text version is great too.
