Zhang
Be yourself and don't go with the flow.

A Summary of Collective Communication, with mpi4py Practice

【2025-06-28】Explains the communication principles behind Broadcast, Scatter and Gather, All-Gather and All-Reduce, and All-to-All, then puts them into practice with the mpi4py library.

AWQ Quantization Explained

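【2024-11-01】First reviews the three core ideas of the SmoothQuant algorithm, then walks through the two core ideas (innovations) of the AWQ algorithm. LLM weights are not equally important: only a small fraction (0.1% to 1%) of salient weights has a large impact on the model's output accuracy, and because input features with larger magnitudes are usually more important, the salient weight channels need to be selected based on the activation distribution. The post then covers how activation-aware scaling protects these key weights.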

Overview of vLLM Optimization Techniques

【2024-10-26】An overview of vLLM optimization techniques, introducing PagedAttention and continuous batching solutions.

Improvements to the Legendary Fastest Terminal

【2023-02-06】I recently discovered Alacritty, a cross-platform terminal emulator written in Rust and accelerated with OpenGL. It is only around 5MB in size and is touted as the fastest terminal. Out of the box, however, it is quite unattractive, so I considered customizing it to see whether it could become my default terminal. The result turned out quite well.