Fix issues with loading F8Linear from state dict when init_scale is not initialized & loaded from meta device 3ddaa67 aredden committed on Sep 1, 2024
Small fix for issue where f16 CublasLinear layers weren't being used even when available. 6d82dcc aredden committed on Aug 28, 2024
Remove f8 flux, instead configure at load; improved quality & corrected configs 1f9e684 aredden committed on Aug 24, 2024
Dynamic swap with cublas linear / optional improved precision with VRAM drawback 37bd8c1 aredden committed on Aug 24, 2024
Remove unnecessary code, hide prints behind debug flag, hide warnings 0f3134f aredden committed on Aug 20, 2024
Add fields to configs, fix issue with offload from bnb, remove extra random text code 340f0a0 aredden committed on Aug 19, 2024
Fix non-offload inference & add option to load from prequantized flux 2f2c44c aredden committed on Aug 18, 2024