The model cache¶
DSS has its own (optional) managed cache to store models from Hugging Face.
If enabled at the connection level, the cache is automatically populated when using the LLM Mesh with pre-trained models from Hugging Face. Models are downloaded from Hugging Face Hub.
This cache is especially useful for air-gapped instances, where models need to be imported into DSS before they can be used through a Local Hugging Face connection.
Import and export models¶
In DSS, the model cache can be managed from Administration > Settings > Misc > Model cache.
From this page, you can:
Monitor the disk usage of the model cache
View the cached models
Delete models
Export models
Import models
If your DSS instance does not have access to Hugging Face (huggingface.co), you can manually import a model archive, typically one exported from the model cache of another DSS design or automation node with network access.
Build your own model archive to import¶
Note
It is simpler (and recommended) to import a model that was previously exported by DSS when that is possible, instead of creating an archive manually.
If you want to create an archive manually, it should have the following structure:
a root folder
a folder named ``model`` containing the Hugging Face model repo content
a file named ``model_metadata.json``
To retrieve the model folder content from Hugging Face Hub:
git lfs install
git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
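Once the repo is cloned, the cloned content can be packaged into the expected layout with a short script. This is only a sketch: the archive format used here (zip) and the helper name `build_model_archive` are assumptions, not part of the DSS product; the safest reference for the exact format remains an archive exported by DSS itself.

```python
import json
import zipfile
from pathlib import Path

def build_model_archive(repo_dir: Path, metadata: dict, archive_path: Path,
                        root_name: str = "my-model") -> None:
    """Package a cloned Hugging Face repo into the layout described above:
    a root folder (name not important) containing a ``model`` folder with
    the repo content, plus a ``model_metadata.json`` file.
    The zip format is an assumption; compare with a DSS-exported archive."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Place every file of the repo under <root>/model/
        for f in repo_dir.rglob("*"):
            if f.is_file():
                zf.write(f, f"{root_name}/model/{f.relative_to(repo_dir)}")
        # Place model_metadata.json next to the model folder, at the root
        zf.writestr(f"{root_name}/model_metadata.json",
                    json.dumps(metadata, indent=2))

# Illustration with a dummy repo directory standing in for the git clone above
repo = Path("all-MiniLM-L6-v2")
repo.mkdir(exist_ok=True)
(repo / "config.json").write_text("{}")
metadata = {
    "modelDefinition": {"key": "hf@sentence-transformers/all-MiniLM-L6-v2"},
    "version": 0,
}
build_model_archive(repo, metadata, Path("archive.zip"))
print(zipfile.ZipFile("archive.zip").namelist())
```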
Example of a model archive content:
sentence-transformers%2fall-MiniLM-L6-v2 ← folder at the root (its name is not important)
├── model ← a folder named ``model`` containing the Hugging Face model repo content
│ ├── 1_Pooling
│ │ └── config.json
│ ├── README.md
│ ├── config.json
│ ├── config_sentence_transformers.json
│ ├── data_config.json
│ ├── modules.json
│ ├── pytorch_model.bin
│ ├── sentence_bert_config.json
│ ├── special_tokens_map.json
│ ├── tokenizer.json
│ ├── tokenizer_config.json
│ ├── train_script.py
│ └── vocab.txt
└── model_metadata.json ← a file named ``model_metadata.json``
The model_metadata.json file should have the following schema:
{
"commitHash": "7dbbc90392e2f80f3d3c277d6e90027e55de9125",
"downloadDate": 1698300636139,
"downloadedBy": "admin",
"lastDssUsage": 1699570884724,
"lastModified": "2022-11-07T08:44:33.000Z",
"lastUsedBy": "admin",
"libraryName": "sentence-transformers",
"modelDefinition": {
"key": "hf@sentence-transformers/all-MiniLM-L6-v2"
},
"pipelineName": "sentence-similarity",
"sizeInBytes": 91652688,
"taggedLanguages": [
"en"
],
"tags": [
"sentence-transformers",
"pytorch",
"tf",
"rust",
"bert",
"feature-extraction",
"sentence-similarity",
"en",
"dataset:s2orc",
"dataset:flax-sentence-embeddings/stackexchange_xml",
...
"arxiv:2104.08727",
"arxiv:1704.05179",
"arxiv:1810.09305",
"license:apache-2.0",
"endpoints_compatible",
"has_space",
"region:us"
],
"url": "https://huggingface.co/sentence-transformers%2Fall-MiniLM-L6-v2/tree/main",
"version": 0
}
Most of these fields can be retrieved from the Hugging Face model repository.
The important ones are:
- modelDefinition:
  key: consists of ``hf@<modelId>``, or ``hf@<modelId>@<revision>`` if a specific revision was used
- version: as of now, should be 0
- url: the URL used to fetch the model
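To make these fields concrete, a minimal metadata file for the model above could be generated as follows. Note that this is a sketch restricted to the fields called out as important; whether DSS accepts an archive whose metadata contains only these fields is an assumption, and copying a ``model_metadata.json`` exported by DSS remains the safer approach.

```python
import json

# Minimal model_metadata.json sketch covering only the important fields
# listed above. This reduced field set is an assumption; prefer a
# DSS-exported model_metadata.json as the authoritative template.
metadata = {
    "modelDefinition": {
        # hf@<modelId>, or hf@<modelId>@<revision> to pin a revision
        "key": "hf@sentence-transformers/all-MiniLM-L6-v2",
    },
    "version": 0,  # as of now, should be 0
    "url": "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/tree/main",
}

with open("model_metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```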
Access cache programmatically¶
You can access models in the DSS-managed model cache programmatically using the following code:
from dataiku.core.model_provider import get_model_from_cache
model_path_in_cache = get_model_from_cache(model_name)
The Python code shown above works both in local and in containerized execution. It expects the model to already be in the cache; it will not trigger a download to the cache.
To download a model from Hugging Face to the DSS-managed model cache programmatically, you can use the following code:
from dataiku.core.model_provider import download_model_to_cache
download_model_to_cache(model_name)
If the model is not already in the cache, this code downloads the model from Hugging Face and stores it in the DSS-managed model cache. If the user running this code is not an administrator, the specified model must be enabled in a Hugging Face connection.
If the model requires a Hugging Face access token, you can provide a connection with a configured access token to use as an optional second argument:
from dataiku.core.model_provider import download_model_to_cache
download_model_to_cache(model_name, connection_name=your_connection)