logo_generator / dev /README.md
boris's picture
fix: typo
7067c27
|
raw
history blame
4 kB

Development Instructions for TPU

Setup

  • Apply to the TRC program for free TPU credits if you're elligible.
  • Follow the Cloud TPU VM User's Guide to set up gcloud.
  • Verify gcloud config list, in particular account, project & zone.
  • Create a TPU VM per the guide and connect to it.

When needing a larger disk:

  • Create a balanced persistent disk (SSD, so pricier than default HDD but much faster): gcloud compute disks create DISK_NAME --size SIZE_IN_GB --type pd-balanced
  • Attach the disk to your instance by adding --data-disk source=REF per "Adding a persistent disk to a TPU VM" guide, eg gcloud alpha compute tpus tpu-vm create INSTANCE_NAME --accelerator-type=v3-8 --version=v2-alpha --data-disk source=projects/tpu-toys/zones/europe-west4-a/disks/DISK_NAME
  • Format the partition as described in the guide.
  • Make sure to set up automatic remount of disk at restart.

Connect VS Code

  • Find external IP in the UI or with gcloud alpha compute tpus tpu-vm describe INSTANCE_NAME

  • Verify you can connect in terminal with ssh EXTERNAL_IP -i ~/.ssh/google_compute_engine

  • Add the same command as ssh host in VS Code.

  • Check config file

    Host INSTANCE_NAME
      HostName EXTERNAL_IP
      IdentityFile ~/.ssh/google_compute_engine
    

Environment configuration

Use virtual environments (optional)

We recommend using virtual environments (such as conda, venv or pyenv-virtualenv).

If you want to use pyenv and pyenv-virtualenv:

  • Installation

    • Set up build environment

    • Use pyenv-installer: curl https://pyenv.run | bash

    • bash set-up:

      echo '\n'\
          '# pyenv setup \n'\
          'export PYENV_ROOT="$HOME/.pyenv" \n'\
          'export PATH="$PYENV_ROOT/bin:$PATH" \n'\
          'eval "$(pyenv init --path)" \n'\
          'eval "$(pyenv init -)" \n'\
          'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
      
  • Usage

    • Install a python version: pyenv install X.X.X

    • Create a virtual environment: pyenv virtualenv 3.9.6 dalle_env

    • Activate: pyenv activate dalle_env

      Note: you can auto-activate your environment at a location with echo dalle_env >> .python-version

Tools

  • Git

    • `git config --global user.email "[email protected]"
    • `git config --global user.name "First Last"
  • Github CLI

  • Direnv

    • Install direnv: sudo apt-get update && sudo apt-get install direnv

    • bash set-up:

      echo -e '\n'\
          '# direnv setup \n'\
          'eval "$(direnv hook bash)" \n' >> ~/.bashrc
      

Set up repo

  • Clone repo: gh repo clone borisdayma/dalle-mini
  • If using pyenv-virtualenv, auto-activate env: echo dalle_env >> .python-version

Environment

  • Install the following (use it later to update our dev requirements.txt)
requests
pillow
jupyterlab
ipywidgets

-e ../datasets[streaming]
-e ../transformers
-e ../webdataset

# JAX
--find-links https://storage.googleapis.com/jax-releases/libtpu_releases.html
jax[tpu]>=0.2.16
flax
  • transformers-cli login

  • set HF_HOME="/mnt/disks/persist/cache/huggingface" in /etc/environment and ensure you have required permissions, then restart.

Working with datasets or models

  • Install Git LFS
  • Clone a dataset without large files: GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/datasets/.../...
  • Use a local credential store for caching credentials
  • Track specific extentions: git lfs track "*.ext"
  • See files tracked with LFS with git lfs ls-files