
Jing

hij

AI & ML interests

None yet

hij's activity

liked a Space about 2 months ago

liujch1998/infini-gram

upvoted a collection 3 months ago

TOFU Unlearned Models

Collection of Phi TOFU models with various configurations • 17 items • Updated Oct 8, 2024

authored 3 papers 7 months ago

Rigorously Assessing Natural Language Explanations of Neurons

Paper • 2309.10312 • Published Sep 19, 2023

A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments

Paper • 2401.12631 • Published Jan 23, 2024

pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

Paper • 2403.07809 • Published Mar 12, 2024

authored a paper 8 months ago

RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations

Paper • 2402.17700 • Published Feb 27, 2024