Zhengxuan Wu

zen's blog

my core dump about interpretability, language models, and other stuff.

Blog posts

May 27, 2025
Representation steering is a powerful tool for understanding and controlling the behavior of language models. In this post, I share lessons learned from our recent work on training a better representation steering method with a preference-based training objective.
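The core idea behind representation steering can be sketched in a few lines: add a scaled direction to a hidden activation at some layer. This is a minimal illustration, not the method from the post; the function name, shapes, and the choice of a fixed strength `alpha` are all illustrative assumptions.

```python
import numpy as np

def steer(hidden, direction, alpha=4.0):
    """Add a scaled, unit-norm steering direction to a hidden activation.

    hidden:    (d,) activation vector from some layer (hypothetical input)
    direction: (d,) steering vector, e.g. a learned or contrastive direction
    alpha:     steering strength (hyperparameter)
    """
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

rng = np.random.default_rng(0)
h = rng.normal(size=8)
v = rng.normal(size=8)
h_steered = steer(h, v, alpha=2.0)
print(h_steered.shape)  # (8,)
```

In practice the direction would be learned or extracted from model activations rather than sampled at random, and the intervention would be hooked into a forward pass.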
April 05, 2024
Representation finetuning (ReFT) is a novel approach to parameter-efficient, powerful, and interpretable fine-tuning of language models. It draws inspiration from our interpretability work on distributed alignment search (DAS). Instead of training any model weights, we train interventions that edit representations on the fly. We show that editing a very small number of representations is enough to match or approach state-of-the-art (SoTA) performance across a wide range of tasks.
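To make "train interventions that edit representations" concrete, here is a rough numpy sketch of a low-rank intervention in the spirit of LoReFT, which edits a hidden state only within a rank-r subspace. Shapes, names, and the random parameters are illustrative assumptions; in the actual method R, W, and b are trained while the model weights stay frozen.

```python
import numpy as np

def loreft(h, R, W, b):
    """Low-rank representation intervention, LoReFT-style sketch:
    Phi(h) = h + R^T (W h + b - R h),
    where R (r x d) has orthonormal rows, so the edit is confined to
    the rank-r subspace spanned by those rows.
    """
    return h + R.T @ (W @ h + b - R @ h)

rng = np.random.default_rng(0)
d, r = 8, 2
h = rng.normal(size=d)
R = np.linalg.qr(rng.normal(size=(d, r)))[0].T  # r x d, orthonormal rows
W = rng.normal(size=(r, d))
b = rng.normal(size=r)
print(loreft(h, R, W, b).shape)  # (8,)
```

Because the edit lives entirely in the row space of R, the number of trained parameters scales with r and d rather than with the model size.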
May 09, 2023
Obtaining robust, human-interpretable explanations of large, general-purpose language models is an urgent goal for AI. Building on the theory of causal abstraction, we release a generic library that encapsulates Boundless DAS, introduced in our paper, for finding representations that play a given causal role in LLMs with billions of parameters.
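The basic operation behind DAS-style methods is the interchange intervention: swap the component of one input's representation along a learned subspace with the corresponding component from another input, and see whether the model's behavior changes accordingly. Below is a simplified sketch of that operation alone (not the library's API; names and shapes are assumptions).

```python
import numpy as np

def interchange(h_base, h_source, R):
    """DAS-style interchange intervention (simplified sketch):
    replace the component of h_base lying in the subspace spanned by
    the orthonormal rows of R with the corresponding component of
    h_source, leaving the orthogonal complement of h_base untouched.
    """
    return h_base + R.T @ (R @ h_source - R @ h_base)

rng = np.random.default_rng(1)
d, r = 8, 2
R = np.linalg.qr(rng.normal(size=(d, r)))[0].T  # r x d, orthonormal rows
h_base, h_source = rng.normal(size=d), rng.normal(size=d)
h_new = interchange(h_base, h_source, R)
print(h_new.shape)  # (8,)
```

In Boundless DAS the subspace (here a fixed R) is itself learned so that the interchanged component tracks a hypothesized causal variable.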
June 30, 2022
It took me years to transition from an aerospace engineering student to an NLP Ph.D. student. I want to share my experience as much as I can, so people can build on top of it and make their own experience even better. For my SOP, I have to credit my good friend Nelson F. Liu: I wrote my SOP based on his! I applied twice, and I am also happy to share the version from my failed attempt. One takeaway for me: you need a big vision that is grounded in specific past experience.