Usage with bulk RNA-seq?
Congratulations on your remarkable work! I have a question: can we use your method to infer gene networks from traditional bulk RNA-seq data? If so, what is the minimum sample size I can use for training the model, and would 1 work?
Thank you for your interest in Geneformer. In the manuscript we demonstrated multiple zero-shot learning applications of the model including determining transcription factor targets. However, if you have additional task-specific data, fine-tuning towards your downstream task would generally improve results. If you are asking whether it's possible to fine-tune the model with bulk RNA-seq data as the input, we have not evaluated that, but it may be possible. The bulk data would be tokenized to the maximum input size of the model.
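As a rough illustration of what "tokenized to the maximum input size of the model" could mean for one bulk sample, here is a minimal sketch. It assumes a rank-value-style encoding in which genes are ordered by expression scaled by a per-gene normalization factor and the list is then truncated. The function name `rank_value_tokenize`, the toy normalization factors, and the default `max_len=2048` are assumptions for illustration; this is not the package's own tokenizer, and it has not been evaluated on bulk data.

```python
import numpy as np

def rank_value_tokenize(expression, gene_ids, gene_medians, max_len=2048):
    """Hypothetical helper: order detected genes by scaled expression
    (highest first) and keep at most max_len of them."""
    expression = np.asarray(expression, dtype=float)
    gene_medians = np.asarray(gene_medians, dtype=float)

    # Keep only genes that were detected in this sample.
    detected = np.flatnonzero(expression > 0)
    if detected.size == 0:
        return []

    # Scale by the sample's total counts, then by the per-gene factor
    # (assumed here to be a nonzero median from some reference corpus).
    scaled = expression[detected] / expression.sum() / gene_medians[detected]

    # Sort descending; a stable sort keeps input order for ties.
    order = np.argsort(-scaled, kind="stable")[:max_len]
    return [gene_ids[i] for i in detected[order]]

# Toy sample with four genes; values are made up for illustration.
toy_tokens = rank_value_tokenize(
    expression=[100, 0, 50, 10],
    gene_ids=["A", "B", "C", "D"],
    gene_medians=[10.0, 1.0, 1.0, 0.5],
    max_len=2,
)
```

With >60,000 genes per bulk sample, truncation of this kind would discard most of the profile, so which genes survive depends entirely on the normalization factors you choose.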
Transferring the model to bulk RNA-seq is an interesting topic. Our bulk RNA-seq data cover >60,000 genes, including coding and non-coding genes, and our focus is on disease states and other clinical conditions such as the development of acute kidney injury, lung injury, and so on. May I treat such a patient-level disease condition as a cell state, as in the main Nature paper?
Thank you for your question. Yes, you can consider disease states as a cell state; this is our approach in the Nature paper for the cardiomyopathy therapeutic target prediction.