Way to extract feature importance
Hi all, I was wondering whether, after training a Geneformer model on your own data, there is a way to extract which genes contribute most to the prediction, like a SHAP feature importance analysis or something along those lines.
Thank you for your question! If you have fine-tuned the model to distinguish particular classes, you could use in silico perturbation to determine which genes contribute most to a given class by identifying the genes whose removal shifts the prediction most toward the opposing class. Another option is to determine which genes receive the most attention, which you can do by examining the model's attention weights.
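To make the deletion idea concrete, here is a minimal sketch in plain Python. It is not Geneformer's own in silico perturbation code; it only illustrates the logic of removing one gene at a time and measuring how far the predicted class probability moves. The names `rank_genes_by_deletion`, `toy_predict_proba`, and `TOY_WEIGHTS` are hypothetical, and the toy scorer stands in for a real fine-tuned classifier that would take a cell's gene list and return the probability of the class of interest.

```python
from typing import Callable, List, Sequence, Tuple


def rank_genes_by_deletion(
    genes: Sequence[str],
    predict_proba: Callable[[Sequence[str]], float],
) -> List[Tuple[str, float]]:
    """Rank genes by how much deleting each one lowers the predicted
    probability of the class of interest (larger shift = more important)."""
    baseline = predict_proba(genes)
    shifts = []
    for i, gene in enumerate(genes):
        # Remove a single gene and keep the order of the remaining ones.
        perturbed = list(genes[:i]) + list(genes[i + 1:])
        shifts.append((gene, baseline - predict_proba(perturbed)))
    return sorted(shifts, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in for a fine-tuned classifier: each gene adds a fixed
# amount to the probability of the class of interest.
TOY_WEIGHTS = {"GENE_A": 0.5, "GENE_B": 0.25, "GENE_C": 0.0}


def toy_predict_proba(genes: Sequence[str]) -> float:
    return sum(TOY_WEIGHTS.get(g, 0.0) for g in genes)


ranking = rank_genes_by_deletion(["GENE_A", "GENE_B", "GENE_C"], toy_predict_proba)
for gene, shift in ranking:
    print(gene, shift)
```

With a real model, `predict_proba` would wrap tokenization and a forward pass, and the shifts would typically be averaged over many cells of the class rather than computed on a single one.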
Thank you for the quick response! Sorry if this is a naive question, but how do I extract the attention weights from the model?
Please see this prior discussion: https://huggingface.co/ctheodoris/Geneformer/discussions/221
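Once you have the attention weights, one way to turn them into a per-gene score is sketched below. This assumes the fine-tuned model is a standard Hugging Face `transformers` model, where passing `output_attentions=True` to the forward call returns one attention tensor per layer with shape (batch, heads, sequence, sequence). The sketch further assumes you have already converted one cell's attention to nested Python lists indexed as `[layer][head][query][key]`; the function name `attention_received_per_gene` and the toy numbers are hypothetical, and averaging over layers, heads, and query positions is just one of several reasonable aggregations.

```python
from typing import Dict, List, Sequence

# attention[layer][head][query][key] for a single cell
Attention = List[List[List[List[float]]]]


def attention_received_per_gene(
    attention: Attention, genes: Sequence[str]
) -> Dict[str, float]:
    """Average attention each gene (as a key token) receives, taken over
    all layers, heads, and query positions."""
    totals = [0.0] * len(genes)
    n_rows = 0
    for layer in attention:
        for head in layer:
            for query_row in head:
                # Each row holds the weights one query token puts on every key.
                for key_idx, weight in enumerate(query_row):
                    totals[key_idx] += weight
                n_rows += 1
    return {gene: total / n_rows for gene, total in zip(genes, totals)}


# Toy example: 1 layer, 2 heads, 2 gene tokens (each row sums to 1).
toy_attention = [
    [
        [[0.75, 0.25], [0.5, 0.5]],
        [[1.0, 0.0], [0.25, 0.75]],
    ]
]
scores = attention_received_per_gene(toy_attention, ["GENE_A", "GENE_B"])
print(scores)
```

In practice you would compute these scores for many cells and compare them between classes, since attention on a single cell can be noisy.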