CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection Paper • 2508.12535 • Published Aug 18, 2025 • 2 • 2
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies Paper • 2506.17673 • Published Jun 21, 2025 • 7 • 1