---
license: apache-2.0
datasets:
- thegoodfellas/mc4-pt-cleaned
language:
- pt
inference: false
metrics:
- bleu
library_name: transformers
pipeline_tag: text2text-generation
---

# Model Card for Model ID

This is the PT-BR Flan-T5-base model.

Forked from: https://huggingface.co/thegoodfellas/tgf-flan-t5-base-ptbr

# Model Details

## Model Description

This model was created as a base for study by researchers who want to learn how Flan-T5 works. This is the Portuguese version.

- **Developed by:** The Good Fellas team
- **Model type:** Flan-T5
- **Language(s) (NLP):** Portuguese (BR)
- **License:** apache-2.0
- **Finetuned from model:** Flan-T5-base

We would like to thank the TPU Research Cloud team for the amazing opportunity given to us. To learn about TRC: https://sites.research.google/trc/about/

# Uses

This model can be used as a base for downstream tasks, as described in the Flan-T5 paper.

# Bias, Risks, and Limitations

Due to the nature of the web-scraped corpus on which Flan-T5 models were trained, it is likely that their usage could reproduce and amplify pre-existing biases in the data, resulting in potentially harmful content such as racial or gender stereotypes and conspiracist views. For this reason, the study of such biases is explicitly encouraged, and model usage should ideally be restricted to research-oriented and non-user-facing endeavors.

## How to Get Started with the Model

Use the code below to get started with the model.

```
from transformers import FlaxT5ForConditionalGeneration

# Load the Flax checkpoint from the Hugging Face Hub.
model_flax = FlaxT5ForConditionalGeneration.from_pretrained("thegoodfellas/tgf-flan-t5-base-ptbr")
```

# Training Details

## Training Data

The model was trained on two datasets, BrWac and Oscar (Portuguese section).

## Training Procedure

We trained this model for 1 epoch on each dataset.

### Training Hyperparameters

Thanks to TPU Research Cloud, we were able to train this model on TPU.
We used a single TPUv2-8.

- **Training regime:**
  - Precision: bf16
  - Batch size: 32
  - LR: 0.005
  - Warmup steps: 10_000
  - Epochs: 1 (each dataset)
  - Optimizer: Adafactor

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

Experiments were conducted using Google Cloud Platform in region us-central1, which has a carbon efficiency of 0.57 kgCO$_2$eq/kWh. A total of 50 hours of computation was performed on hardware of type TPUv2 Chip (TDP of 221W).

Total emissions are estimated to be 6.3 kgCO$_2$eq, of which 100 percent was directly offset by the cloud provider.

- **Hardware Type:** TPUv2
- **Hours used:** 50
- **Cloud Provider:** GCP
- **Compute Region:** us-central1
- **Carbon Emitted:** 6.3 kgCO$_2$eq

# Technical Specifications

## Model Architecture and Objective

Flan-T5
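
As a cross-check on the Environmental Impact figures above, the reported estimate can be approximated with simple arithmetic: energy in kWh multiplied by the region's carbon efficiency. The sketch below is only an illustration under the assumption that the hardware draws its full TDP for the whole run. The function name `estimate_emissions_kg` is hypothetical, and the calculator itself may apply a more detailed method.

```python
# Values taken from the Environmental Impact section of this card.
TDP_WATTS = 221.0          # TPUv2 chip thermal design power, in watts
HOURS = 50.0               # cumulative compute time, in hours
CARBON_EFFICIENCY = 0.57   # kgCO2eq per kWh for us-central1


def estimate_emissions_kg(tdp_watts: float, hours: float, kg_per_kwh: float) -> float:
    """Estimate emissions as energy (kWh) times grid carbon efficiency.

    Assumption: the hardware draws its full TDP for the entire run.
    """
    energy_kwh = tdp_watts / 1000.0 * hours
    return energy_kwh * kg_per_kwh


emissions = estimate_emissions_kg(TDP_WATTS, HOURS, CARBON_EFFICIENCY)
print(f"{emissions:.1f} kgCO2eq")  # prints "6.3 kgCO2eq"
```

With the values above, the energy comes to 11.05 kWh, which matches the 6.3 kgCO$_2$eq reported in this card after rounding.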