Unable to produce meaningful in silico perturb results
#223, opened by Pingiotto
I am trying to reproduce some of your results using the fine-tuned CM cell classifier model in conjunction with the human_dcm_hcm_nf dataset. I tried deleting ADCY5, SRPK3, and ACTB (each independently of the others), and all three gave statistically insignificant results. Any guidance would be greatly appreciated.
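from geneformer import InSilicoPerturber, InSilicoPerturberStats

# Note: `prefix` (the output file prefix) is defined elsewhere and was not shown in the post.

# Configure deletion of a single gene in cardiomyocytes, modeling the dcm -> nf shift.
isp = InSilicoPerturber(
    perturb_type="delete",
    perturb_rank_shift=None,
    genes_to_perturb=["ENSG00000184343"],
    combos=0,
    anchor_gene=None,
    model_type="CellClassifier",
    num_classes=3,
    emb_mode="cell",
    cell_emb_style="mean_pool",
    filter_data={"cell_type": ["Cardiomyocyte1", "Cardiomyocyte2", "Cardiomyocyte3"]},
    cell_states_to_model={
        "state_key": "disease",
        "start_state": "dcm",
        "goal_state": "nf",
        "alt_states": ["hcm"],
    },
    max_ncells=None,
    emb_layer=0,
    forward_batch_size=50,
    nproc=16,
)

# Arguments: model directory, input dataset, output directory, output prefix.
isp.perturb_data(
    "./CellClassifier_cardiomyopathies/",
    "./human_dcm_hcm_nf.dataset",
    "./perturb_out/",
    prefix,
)

ispstats = InSilicoPerturberStats(
    mode="goal_state_shift",
    genes_perturbed=["ENSG00000184343"],
    combos=0,
    anchor_gene=None,
    cell_states_to_model={"disease": (["dcm"], ["nf"], ["hcm"])},
)

# Arguments: input directory, null distribution directory (none here), output directory, output prefix.
ispstats.get_stats(
    "./perturb_out/",
    None,
    "./perturb_out/",
    prefix,
)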
The run produced cell embedding shifts for each cell expressing the respective gene. Mean values of the Shift_to_goal_end column (± standard deviation) were:
ACTB: 0.0013 ± 0.0160
ADCY5: -0.0011 ± 0.0340
SRPK3: -0.00105 ± 0.0246
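To put these numbers on a common scale, here is a minimal sketch that divides each mean shift by its standard deviation, using only the three summary values listed above. The helper name `standardized_shift` is hypothetical and not part of Geneformer. The ratio is a rough effect-size indicator, not the statistic that InSilicoPerturberStats reports.

```python
# Summary values copied from the post: (mean, std dev) of Shift_to_goal_end.
shifts = {
    "ACTB": (0.0013, 0.0160),
    "ADCY5": (-0.0011, 0.0340),
    "SRPK3": (-0.00105, 0.0246),
}


def standardized_shift(mean, std):
    """Hypothetical helper: mean shift expressed in units of its standard deviation."""
    return mean / std


# Compute the ratio for each gene and print it with its sign.
ratios = {gene: standardized_shift(m, s) for gene, (m, s) in shifts.items()}
for gene, ratio in ratios.items():
    print(f"{gene}: {ratio:+.3f}")
```

For all three genes the mean shift is less than a tenth of one standard deviation in magnitude. Whether that is significant still depends on the number of cells, which the post does not give.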
Thank you for your interest in Geneformer! Please see closed discussion 246:
https://huggingface.co/ctheodoris/Geneformer/discussions/246
ctheodoris changed discussion status to closed