aws-neuron/optimum-neuron-cache
Organization: AWS Inferentia and Trainium (aws-neuron)
License: apache-2.0
Revision: d8e677c
Directory: optimum-neuron-cache/inference-cache-config
8 contributors · 54 commits
Latest commit: 687da09 (verified) by dacorvo (HF staff), "Add configuration for granite models", 29 days ago
File                   | Size      | Last commit message                                                                          | Last updated
-----------------------|-----------|----------------------------------------------------------------------------------------------|---------------
gpt2.json              | 398 Bytes | Add more gpt2 configurations                                                                 | 10 months ago
granite.json           | 1.3 kB    | Add configuration for granite models                                                         | 29 days ago
llama-variants.json    | 559 Bytes | Remove obsolete llama variants                                                               | 4 months ago
llama.json             | 1.67 kB   | Update inference-cache-config/llama.json                                                     | 4 months ago
llama2-70b.json        | 287 Bytes | Create llama2-70b.json                                                                       | 7 months ago
llama3-70b.json        | 283 Bytes | Update inference-cache-config/llama3-70b.json                                                | 4 months ago
llama3.1-70b.json      | 289 Bytes | Rename inference-cache-config/Llama3.1-70b.json to inference-cache-config/llama3.1-70b.json  | 4 months ago
mistral-variants.json  | 1.04 kB   | Remove obsolete mistral variants                                                             | 4 months ago
mistral.json           | 1.8 kB    | Update inference-cache-config/mistral.json                                                   | 4 months ago
mixtral.json           | 583 Bytes | Update inference-cache-config/mixtral.json                                                   | 4 months ago
qwen2.5-large.json     | 558 Bytes | Rename inference-cache-config/qwen-2.5-large.json to inference-cache-config/qwen2.5-large.json | about 2 months ago
qwen2.5.json           | 1.45 kB   | Rename inference-cache-config/qwen2.5 to inference-cache-config/qwen2.5.json                 | about 2 months ago
stable-diffusion.json  | 1.91 kB   | Update inference-cache-config/stable-diffusion.json                                          | 4 months ago