<feed xmlns="http://www.w3.org/2005/Atom"> <id>/</id><title>Chirpy</title><subtitle>A minimal, responsive and feature-rich Jekyll theme for technical writing.</subtitle> <updated>2026-04-29T21:33:30+08:00</updated> <author> <name>刘鑫</name> <uri>/</uri> </author><link rel="self" type="application/atom+xml" href="/feed.xml"/><link rel="alternate" type="text/html" hreflang="en" href="/"/> <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator> <rights> © 2026 刘鑫 </rights> <icon>/assets/img/favicons/favicon.ico</icon> <logo>/assets/img/favicons/favicon-96x96.png</logo> <entry><title>Triton Fused Softmax Kernel</title><link href="/posts/Triton-Fused-Softmax/" rel="alternate" type="text/html" title="Triton Fused Softmax Kernel" /><published>2026-04-28T00:23:14+08:00</published> <updated>2026-04-28T00:23:14+08:00</updated> <id>/posts/Triton-Fused-Softmax/</id> <content type="text/html" src="/posts/Triton-Fused-Softmax/" /> <author> <name>peter_lau</name> </author> <category term="AI" /> <summary>The code in this post comes from triton-02-fused-softmax. Triton's sample implementation rests on one assumption: **each row of data fits entirely in the GPU's shared_memory**. Driver code around the kernel. Sample code: properties = driver.active.utils.get_device_properties(DEVICE.index) NUM_SM = properties["multiprocessor_count"] NUM_REGS = properties["max_num_regs"] SIZE_SMEM = properties["max_shared_mem"] WARP_SIZE = properties["warpSize"] target = triton.runtime.driver.active.get_current_target() kerne...</summary> </entry> <entry><title>Agent Harness Engineering</title><link href="/posts/Agent-Harness-Enineering/" rel="alternate" type="text/html" title="Agent Harness Engineering" /><published>2026-04-15T00:00:00+08:00</published> <updated>2026-04-15T00:00:00+08:00</updated> <id>/posts/Agent-Harness-Enineering/</id> <content type="text/html" src="/posts/Agent-Harness-Enineering/" /> <author> <name>peter_lau</name> </author> <category term="AI" /> <summary>This post summarizes learn-harness-engineering. 🔥 marks key sections; 🙋‍♂️ marks my own interpretation. A capable model is not the same as reliable execution. Anthropic ran a controlled experiment with the same prompt ("build a 2D
retro game editor") and the same model (Opus 4.5). In the first run the model worked bare for 20 minutes and spent $9, and the game's core features did not run at all. In the second run it was given a full harness (planner + generator + evaluator); after 6 hours and $200, the game was playable. The model did not change. Opus 4.5 was still the same Opus 4.5. What changed was the saddle. OpenAI's harness engineering article, published in 2025, puts it more bluntly: in a repository with a well-built harness, Codex can go from "unreliable" to "reliable". Note their wording: not "a bit better", but a qualitative change. It is like a thousand-li horse without a...</summary> </entry> <entry><title>Foreword: AI Systems Performance Engineering</title><link href="/posts/AI_system_performance_engineering_one/" rel="alternate" type="text/html" title="Foreword: AI Systems Performance Engineering" /><published>2026-02-08T00:00:00+08:00</published> <updated>2026-02-08T00:00:00+08:00</updated> <id>/posts/AI_system_performance_engineering_one/</id> <content type="text/html" src="/posts/AI_system_performance_engineering_one/" /> <author> <name>peter_lau</name> </author> <category term="AI" /> <summary>Reading notes on AI Systems Performance Engineering</summary> </entry> <entry><title>Xie An (谢安), the Urbane Chancellor</title><link href="/posts/%E8%B0%A2%E5%AE%89/" rel="alternate" type="text/html" title="Xie An (谢安), the Urbane Chancellor" /><published>2026-01-28T00:00:00+08:00</published> <updated>2026-01-28T00:00:00+08:00</updated> <id>/posts/%E8%B0%A2%E5%AE%89/</id> <content type="text/html" src="/posts/%E8%B0%A2%E5%AE%89/" /> <author> <name>peter_lau</name> </author> <category term="wemedia" /> <category term="history" /> <summary>Profiles of figures from the Jin dynasties and the Sixteen Kingdoms</summary> </entry> <entry><title>A Quick Overview of the Evolution from ViT to Qwen3-VL</title><link href="/posts/vit_evolution/" rel="alternate" type="text/html" title="A Quick Overview of the Evolution from ViT to Qwen3-VL" /><published>2026-01-28T00:00:00+08:00</published> <updated>2026-01-28T00:00:00+08:00</updated> <id>/posts/vit_evolution/</id> <content type="text/html" src="/posts/vit_evolution/" /> <author> <name>peter_lau</name> </author> <category term="AI" /> <summary>This post is still being written; stay tuned.</summary> </entry> </feed>
