Using the huggingface-cli command-line tool

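The huggingface_hub Python package ships with a built-in CLI named huggingface-cli, which lets you interact with the Hugging Face Hub (the central repository for Hugging Face models and datasets) from the command line.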

Introduction

The huggingface_hub package, which provides huggingface-cli, lets you:

Download files from the Hub
Upload files to the Hub
Manage your repositories
Run Inference on deployed models
Search for models, datasets and Spaces
Share Model Cards to document your models
Engage with the community through PRs and comments

Installation

# Install from the Aliyun PyPI mirror (useful where pypi.org is slow)
pip install -U "huggingface_hub[cli]" -i https://mirrors.aliyun.com/pypi/simple/

# Or install from the default PyPI index
pip install -U "huggingface_hub[cli]"

# Upgrade
pip install --upgrade huggingface_hub

Help

Output of huggingface-cli --help:

$ huggingface-cli --help
usage: huggingface-cli <command> [<args>]

positional arguments:
  {download,upload,repo-files,env,login,whoami,logout,auth,repo,lfs-enable-largefiles,lfs-multipart-upload,scan-cache,delete-cache,tag,version,upload-large-folder}
                        huggingface-cli command helpers
    download            Download files from the Hub
    upload              Upload a file or a folder to a repo on the Hub
    repo-files          Manage files in a repo on the Hub
    env                 Print information about the environment.
    login               Log in using a token from huggingface.co/settings/tokens
    whoami              Find out which huggingface.co account you are logged in as.
    logout              Log out
    auth                Other authentication related commands
    repo                {create} Commands to interact with your huggingface.co repos.
    lfs-enable-largefiles
                        Configure your repository to enable upload of files > 5GB.
    scan-cache          Scan cache directory.
    delete-cache        Delete revisions from the cache directory.
    tag                 (create, list, delete) tags for a repo in the hub
    version             Print information about the huggingface-cli version.
    upload-large-folder
                        Upload a large folder to a repo on the Hub

options:
  -h, --help            show this help message and exit

Basic usage

Environment variables

# Enable debug logging
export HF_DEBUG=1

# Change the Hugging Face home directory (default: ~/.cache/huggingface)
export HF_HOME="/data"

Setting a huggingface.co mirror

export HF_ENDPOINT=https://hf-mirror.com
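
The same settings can also be applied from Python. This is a minimal sketch, assuming the variables are set before huggingface_hub is imported, since the library reads them at import time. The helper name configure_hf_env is hypothetical, and the default values simply repeat the ones used above.

```python
import os


def configure_hf_env(endpoint="https://hf-mirror.com", home="/data", debug=False):
    """Set Hugging Face environment variables for the current process.

    Hypothetical helper: call it before importing huggingface_hub, because
    the library reads these variables when it is first imported.
    """
    os.environ["HF_ENDPOINT"] = endpoint  # mirror used instead of huggingface.co
    os.environ["HF_HOME"] = home          # root folder for cache and token files
    if debug:
        os.environ["HF_DEBUG"] = "1"      # verbose logging
    return {key: os.environ[key] for key in ("HF_ENDPOINT", "HF_HOME")}


settings = configure_hf_env()
print(settings)
```

Unlike export, this only affects the current Python process and any child processes it starts.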

P.S. You can also download with hfd (see https://hf-mirror.com/):

wget https://hf-mirror.com/hfd/hfd.sh
chmod a+x hfd.sh
export HF_ENDPOINT=https://hf-mirror.com

# Download a model
./hfd.sh gpt2

# Download a dataset
./hfd.sh wikitext --dataset

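Downloading files

Downloading a single file

# Command line: general form
huggingface-cli download <model_id> --local-dir <local_path>

# Download specific files from a repo into the cache
huggingface-cli download gpt2 config.json model.safetensors
# Download a whole repo into a local directory
huggingface-cli download openai-community/gpt2 --local-dir ./gpt2_model

# Python library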
from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="tiiuae/falcon-7b-instruct", filename="config.json")

hf_hub_download(repo_id="google/fleurs", filename="fleurs.py", repo_type="dataset")

# Download from the `v1.0` tag
hf_hub_download(repo_id="lysandre/arxiv-nlp", filename="config.json", revision="v1.0")

# Download from the `test-branch` branch
hf_hub_download(repo_id="lysandre/arxiv-nlp", filename="config.json", revision="test-branch")

# Download from Pull Request #3
hf_hub_download(repo_id="lysandre/arxiv-nlp", filename="config.json", revision="refs/pr/3")

# Download from a specific commit hash
hf_hub_download(repo_id="lysandre/arxiv-nlp", filename="config.json", revision="877b84a8f93f2d619faa2a6e514a32beef88ab0a")
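
To make the revision argument more concrete: files on the Hub are served from URLs of the form <endpoint>/<repo_id>/resolve/<revision>/<filename>, with a datasets/ or spaces/ prefix for non-model repos. The sketch below rebuilds such a URL with the standard library only. build_resolve_url is a hypothetical helper written for illustration (the library's own equivalent is hf_hub_url), and the exact escaping is an assumption about the library's behavior rather than a copy of it.

```python
from urllib.parse import quote


def build_resolve_url(repo_id, filename, revision="main", repo_type="model",
                      endpoint="https://huggingface.co"):
    """Illustrative only: mimic the download URL of a file at a given revision."""
    # Dataset and Space repos live under a type prefix; models have none.
    prefix = {"model": "", "dataset": "datasets/", "space": "spaces/"}[repo_type]
    # A revision may be a branch, tag, PR ref or commit hash. Slashes in refs
    # such as "refs/pr/3" are percent-encoded so they stay one path segment.
    return (f"{endpoint}/{prefix}{repo_id}/resolve/"
            f"{quote(revision, safe='')}/{quote(filename)}")


pr_url = build_resolve_url("lysandre/arxiv-nlp", "config.json", revision="refs/pr/3")
dataset_url = build_resolve_url("google/fleurs", "fleurs.py", repo_type="dataset")
print(pr_url)
print(dataset_url)
```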

Downloading an entire repository

from huggingface_hub import snapshot_download

snapshot_download("stabilityai/stable-diffusion-2-1")
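
snapshot_download also accepts allow_patterns and ignore_patterns to restrict which files are fetched. The sketch below only previews which file names such glob patterns would keep, using fnmatch from the standard library. preview_selection and the file list are hypothetical, and the matching is an approximation of the library's behavior, not its actual code.

```python
from fnmatch import fnmatch


def preview_selection(filenames, allow_patterns=None, ignore_patterns=None):
    """Return the file names that glob-style allow/ignore patterns would keep."""
    kept = []
    for name in filenames:
        if allow_patterns and not any(fnmatch(name, p) for p in allow_patterns):
            continue  # not matched by any allow pattern
        if ignore_patterns and any(fnmatch(name, p) for p in ignore_patterns):
            continue  # matched by an ignore pattern
        kept.append(name)
    return kept


# Hypothetical repo listing, for illustration only.
files = ["README.md", "config.json", "model.safetensors", "pytorch_model.bin"]
allowed = preview_selection(files, allow_patterns=["*.json", "*.safetensors"])
not_ignored = preview_selection(files, ignore_patterns=["*.bin"])
print(allowed)
print(not_ignored)
```

The corresponding real call would be snapshot_download("stabilityai/stable-diffusion-2-1", allow_patterns=["*.json", "*.safetensors"]).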

Viewing locally cached models and datasets

huggingface-cli caches downloaded models and datasets locally so that they load quickly next time.

huggingface-cli scan-cache
huggingface-cli scan-cache --help

This command lists each cached repository with its size on disk and local path, followed by the total size of the cache.
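
For a rough idea of where those numbers come from, the cache folder can also be inspected directly. The sketch below assumes the usual layout, in which each repo is a folder named like models--<org>--<name> under <HF_HOME>/hub and the actual file contents live in its blobs/ subfolder. summarize_cache is a hypothetical helper; the library's own API for this is scan_cache_dir().

```python
import tempfile
from pathlib import Path


def summarize_cache(hub_dir):
    """Return {(repo_type, repo_id): size_in_bytes} for a hub cache folder.

    Assumes the layout <hub_dir>/<type>s--<org>--<name>/blobs/<file>.
    """
    summary = {}
    for repo_dir in sorted(Path(hub_dir).iterdir()):
        parts = repo_dir.name.split("--")
        if not repo_dir.is_dir() or len(parts) < 2:
            continue  # skip lock folders, version files, etc.
        repo_type = parts[0].rstrip("s")   # "models" -> "model"
        repo_id = "/".join(parts[1:])      # "org--name" -> "org/name"
        blobs = repo_dir / "blobs"
        # Count blobs only: snapshots/ holds symlinks to these same files.
        size = sum(f.stat().st_size for f in blobs.glob("*") if f.is_file())
        summary[(repo_type, repo_id)] = size
    return summary


# Demo on a throwaway directory so that no real cache is touched.
with tempfile.TemporaryDirectory() as tmp:
    blob_dir = Path(tmp) / "models--openai-community--gpt2" / "blobs"
    blob_dir.mkdir(parents=True)
    (blob_dir / "abc123").write_bytes(b"x" * 1024)
    demo = summarize_cache(tmp)
    print(demo)
```

To run it against a real cache, pass the hub folder inside your Hugging Face home directory, which is ~/.cache/huggingface/hub by default.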

Cleaning the cache

If you want to free up disk space, you can clean the cache.

huggingface-cli delete-cache
huggingface-cli delete-cache --help

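Note: delete-cache is interactive. It lets you select which cached revisions to delete and asks for confirmation before removing them. Deleted models and datasets have to be downloaded again, so use it with care.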
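
Cache cleanup can also be scripted. This is a minimal sketch, assuming cache_info is the object returned by huggingface_hub.scan_cache_dir(). plan_repo_cleanup is a hypothetical helper that collects every cached revision of one repo and asks the library for a deletion plan.

```python
def plan_repo_cleanup(cache_info, repo_id):
    """Build a deletion plan for every cached revision of one repo.

    `cache_info` is expected to be the result of
    huggingface_hub.scan_cache_dir().
    """
    hashes = [
        revision.commit_hash
        for repo in cache_info.repos
        if repo.repo_id == repo_id
        for revision in repo.revisions
    ]
    # delete_revisions only prepares a strategy object; nothing is removed
    # from disk until its execute() method is called.
    return cache_info.delete_revisions(*hashes)
```

With the real library, you would call strategy = plan_repo_cleanup(scan_cache_dir(), "<repo_id>"), check strategy.expected_freed_size_str, and then call strategy.execute() to actually delete the files.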

Uploading a model

If you have trained a new model and want to share it on the Hugging Face Hub, use the upload command. You need to log in first (huggingface-cli login).
  
huggingface-cli upload <repo_id> <local_path>

# For example
huggingface-cli upload my-awesome-model ./my_model_folder
huggingface-cli upload Wauplin/my-cool-model ./models/model.safetensors model.safetensors

# Enable git pushes of files > 5GB from a local clone (takes the path of the clone)
huggingface-cli lfs-enable-largefiles <path_to_local_repo>
# Upload a very large folder
huggingface-cli upload-large-folder HuggingFaceM4/Docmatix --repo-type=dataset /path/to/local/docmatix --num-workers=16

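The first example uploads the contents of my_model_folder to the my-awesome-model repository under your user account on the Hugging Face Hub. If the repository does not exist, the command tries to create it.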
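
The same upload can be scripted with the Python library. This is a minimal sketch, assuming api is a huggingface_hub.HfApi instance and that you are already logged in. upload_model_folder is a hypothetical wrapper, while create_repo and upload_folder are the HfApi methods it relies on.

```python
def upload_model_folder(api, repo_id, folder_path, commit_message="Upload folder"):
    """Create the model repo if needed, then upload a local folder to it.

    `api` is expected to be a huggingface_hub.HfApi instance (or any object
    exposing the same create_repo / upload_folder methods).
    """
    # exist_ok=True avoids an error when the repo already exists.
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    return api.upload_folder(
        repo_id=repo_id,
        folder_path=folder_path,
        repo_type="model",
        commit_message=commit_message,
    )
```

A typical call would be upload_model_folder(HfApi(), "my-awesome-model", "./my_model_folder") after from huggingface_hub import HfApi. Passing the client in as an argument keeps the function easy to try out with a stand-in object before touching a real repository.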

Creating a new model repository
  
huggingface-cli repo create <repo_id> --type model

# Example
huggingface-cli repo create my-new-model --type model

Example: creating and uploading a model

mkdir cnn-demo && cd cnn-demo

# create write token https://huggingface.co/settings/tokens/new?tokenType=write
$ huggingface-cli login
...
/Users/xiexianbin/.cache/huggingface/stored_tokens
/Users/xiexianbin/.cache/huggingface/token
...

$ huggingface-cli repo create cnn-demo

# Option 1: push with git
# git lfs install
# git clone https://huggingface.co/xiexianbin/cnn-demo

# mv <path-of>/{cnn-demo.pth,README.md} .
# git add .
# git commit -m "init commit"
# git push

# Option 2: upload with the CLI
$ huggingface-cli upload xiexianbin/cnn-demo .
Start hashing 1 files.
Finished hashing 1 files.
cnn-demo.pth: 100%|██████████████████████████████████████████████████████████████████| 210k/210k [00:01<00:00, 118kB/s]
https://huggingface.co/xiexianbin/cnn-demo/tree/main/.

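Other common commands

huggingface-cli env: prints information about the environment (installed versions, cache paths and related settings).
Renaming and deleting repositories: the help output above lists only create under repo, so these operations are not available as repo subcommands in that version. Use the repository's Settings page on the Hub, or the move_repo and delete_repo functions of the huggingface_hub Python library.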

Other

Hugging Face Storage: Xet

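Summary

huggingface-cli is a very practical tool, especially when you need to work with files on the Hugging Face Hub directly, manage the local cache, or handle authentication.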