4 Model persistence
So far, we have either loaded a dataset or generated it on the fly. We now turn to the models themselves and start by looking into ways to persist the models we generated.
The general idea is to simply store the object we generate and load it at some later time. Nevertheless, this can be quite tricky as we will see in the following.
For example, we might do our training in a different environment than the inference or prediction. We might even switch programming languages between these tasks to extract the best performance.
As we mainly work with scikit-learn, we introduce the concepts with it. Our first step is to check the scikit-learn documentation on model persistence, which we use as a reference for this introduction. We introduce different possibilities; all of them have strengths and weaknesses, and unfortunately there is no gold standard.
Let us use the following toy example, see Listing 4.1, with our cats and dogs.
import numpy as np
import scipy
import requests
import io
import sklearn
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

response = requests.get(
    "https://github.com/dynamicslab/databook_python/"
    "raw/refs/heads/master/DATA/catData_w.mat")
cats_w = scipy.io.loadmat(io.BytesIO(response.content))["cat_wave"]

response = requests.get(
    "https://github.com/dynamicslab/databook_python/"
    "raw/refs/heads/master/DATA/dogData_w.mat")
dogs_w = scipy.io.loadmat(io.BytesIO(response.content))["dog_wave"]

X_train = np.concatenate((cats_w[:60, :], dogs_w[:60, :]))
y_train = np.repeat(np.array([1, -1]), 60)
X_test = np.concatenate((cats_w[60:80, :], dogs_w[60:80, :]))
y_test = np.repeat(np.array([1, -1]), 20)

voting_clf = make_pipeline(
    PCA(n_components=41),
    VotingClassifier(
        estimators=[
            ("lda", LinearDiscriminantAnalysis()),
            ("rf", RandomForestClassifier(
                n_estimators=500,
                max_leaf_nodes=2,
                random_state=6020)),
            ("svc", SVC(
                kernel="linear",
                probability=True,
                random_state=6020)),
        ],
        flatten_transform=False,
    )
)

voting_clf.fit(X_train, y_train)
score = voting_clf.score(X_test, y_test)
print(f"We have a hard voting score of {score}")

We have a hard voting score of 0.8
In the next couple of exercises we create different versions of our model and persist them to storage. Try to keep track of which model version corresponds to which exercise/code block.
We need to install several packages for the following exercises.
All of them can be installed via:
pdm add skl2onnx onnxruntime skops
4.1 Open Neural Network Exchange - ONNX
ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.
Source: https://onnx.ai/, accessed 07.03.2025.
The use-case for ONNX is to persist a model without necessarily using the Python object itself. This is especially useful when the runtime for distributing the model is not Python or is a very restricted Python environment.
Now let us see how we can persist the model of Listing 4.1 via ONNX.
from skl2onnx import to_onnx

onx = to_onnx(voting_clf, X_train[:1].astype(np.int64))  # (1)
with open("model.onnx", "wb") as f:
    f.write(onx.SerializeToString())

(1) Not all data types are supported, so we need to convert our reference training sample to int64 (potentially increasing storage demands).
As mentioned, the file format is binary so it does not make a lot of sense to actually read the resulting file in plain text but we can have a look at the size.
80K model.onnx
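The same size check can also be scripted in Python. The following is a minimal sketch using only the standard library; `human_size` and `report_sizes` are hypothetical helper names, and the unit formatting is an assumption that only roughly mimics a directory listing.

```python
from pathlib import Path


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units, e.g. 81920 -> '80.0K'."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G"):
        # Stop at the first unit that fits, or at the largest unit we know.
        if size < 1024 or unit == "G":
            return f"{size:.1f}{unit}"
        size /= 1024


def report_sizes(paths):
    """Return {file name: formatted size} for every path that exists."""
    return {
        Path(p).name: human_size(Path(p).stat().st_size)
        for p in paths
        if Path(p).is_file()
    }


if __name__ == "__main__":
    # File name as used in this section; a missing file is simply skipped.
    for name, size in report_sizes(["model.onnx"]).items():
        print(f"{size}\t{name}")
```

The helper can be reused for the other formats in this chapter by passing their file names as well.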
Unfortunately, there is no method to convert back to our scikit-learn model. What we can do is use onnxruntime and see if we still get the same score as before.
import onnxruntime as ort
import sklearn.metrics

model = ort.InferenceSession("model.onnx")
input_name = model.get_inputs()[0].name

# The first output of the session holds the predicted labels.
predictions = model.run(None, {input_name: X_test.astype(np.int64)})
score = sklearn.metrics.accuracy_score(y_test, predictions[0])
print(f"We have a test_score of {score} for the recovered model.")

predictions = model.run(None, {input_name: X_train.astype(np.int64)})
score = sklearn.metrics.accuracy_score(y_train, predictions[0])
print(f"We have a train_score of {score} for the recovered model.")

We have a test_score of 0.825 for the recovered model.
We have a train_score of 1.0 for the recovered model.
As we can see, the score is actually better than before, which is odd and definitely not intended.
This is due to the fact that skl2onnx is not able to convert all scikit-learn models exactly. This is especially true for the SVC class included in our composite model. Therefore, the class is stored with the same weights but slightly different parameters.
Furthermore, if we inspect the predictions output from above more closely, it looks like we have switched to soft voting.
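Why a switch from hard to soft voting can change individual predictions can be seen in a small, library-free sketch. The probabilities below are made up for illustration and are not taken from our model; `hard_vote` and `soft_vote` are hypothetical helpers that mimic the two voting rules.

```python
from collections import Counter

labels = ["cat", "dog"]

# Hypothetical class probabilities [P(cat), P(dog)] from three classifiers
# for a single sample; the numbers are invented for this illustration.
probabilities = [
    [0.55, 0.45],
    [0.60, 0.40],
    [0.10, 0.90],
]


def hard_vote(probas):
    """Each classifier votes for its most likely class; the majority wins."""
    votes = [labels[p.index(max(p))] for p in probas]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(probas):
    """Average the probabilities per class and pick the largest mean."""
    means = [sum(column) / len(probas) for column in zip(*probas)]
    return labels[means.index(max(means))]


print("hard voting:", hard_vote(probabilities))
print("soft voting:", soft_vote(probabilities))
```

Two of the three classifiers narrowly prefer "cat", so hard voting returns "cat", while the one confident "dog" prediction dominates the averaged probabilities and soft voting returns "dog".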
Overall, we can see that ONNX is a way to persist a model such that we can make predictions with it, but we no longer have the Python object. Of course, it is possible to write our own converter and persist our required models in a better state, see the sklearn-onnx docs. Regarding file size, we can already say it is efficient, and it provides some independence from our training environment.
4.2 pickle - Python object serialization
We can also swing the pendulum in the other direction and use the Python standard library pickle to persist our model.
Before we go into more detail, we should emphasise the potential security problem we introduce with pickle, as stated in its own docs:
The pickle module is not secure. Only unpickle data you trust.
It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with.
Consider signing data with hmac if you need to ensure that it has not been tampered with.
Safer serialization formats such as json may be more appropriate if you are processing untrusted data, see Comparison with json.
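To make this warning concrete, the following minimal sketch shows how unpickling can run code chosen by whoever created the data. The class `Payload` is hypothetical, and the "attack" is a harmless `eval` of a string expression; a real attacker could call any importable function instead.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle runs code when it is loaded."""

    def __reduce__(self):
        # pickle stores "call eval with this argument" instead of the object.
        return (eval, ("'code ran during unpickling'.upper()",))


data = pickle.dumps(Payload())

# Loading does not give back a Payload: it returns whatever eval produced.
result = pickle.loads(data)
print(result)
```

Nothing in the call to `pickle.loads` hints that code is executed, which is why the origin of the file matters so much.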
As pickle is the native implementation in Python, it is easy to use and works for (almost) all models and configurations. The downside is that we need to absolutely trust the source of our model. We need to trust all steps from the storage provider, through the network, to our own infrastructure.
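One way to detect tampering along this chain is the hmac signing suggested in the pickle docs. The following is a minimal sketch, assuming a secret key that is shared out of band between the side that stores and the side that loads; the helper names `sign_and_dump` and `verify_and_load`, the key, and the byte layout are assumptions for illustration. Signing only protects integrity; it does not make a pickle from an untrusted author safe.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: shared out of band


def sign_and_dump(obj, key=SECRET_KEY):
    """Pickle obj and prepend a hex HMAC-SHA256 digest of the payload."""
    payload = pickle.dumps(obj, protocol=5)
    digest = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    return digest + b"\n" + payload


def verify_and_load(blob, key=SECRET_KEY):
    """Check the digest first and unpickle only if it matches."""
    digest, payload = blob.split(b"\n", 1)
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("Signature mismatch: refusing to unpickle.")
    return pickle.loads(payload)


# A small dictionary stands in for the model object here.
blob = sign_and_dump({"n_components": 41, "random_state": 6020})
print(verify_and_load(blob))
```

The important design choice is the order of operations: the digest is compared before `pickle.loads` is ever called.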
Furthermore, the environment we load the model into needs to be the same as the one we stored it from. As we have already seen how dependency hell influences our development, we import these issues together with the pickle file.
This notably implies that it is not guaranteed that a model can be loaded with a different scikit-learn version, let alone a different numpy version, which is only a sub-dependency of scikit-learn. Furthermore, if different hardware is involved there might be problems as well, e.g. an architecture with a different representation of an integer or float. As a consequence, if we use pickle, thorough version control with package management is key!
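One lightweight way to support this is to record the environment alongside the pickle file. The following is a sketch using only the standard library; the helper names `environment_snapshot` and `mismatches`, the chosen package list, and the dictionary layout are assumptions for illustration.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages=("scikit-learn", "numpy", "scipy")):
    """Collect interpreter, platform, and package versions as a dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "machine": platform.machine(),
        "byteorder": sys.byteorder,
        "packages": versions,
    }


def mismatches(saved, current):
    """Return the package names whose versions differ between snapshots."""
    return sorted(
        name
        for name, version in saved["packages"].items()
        if current["packages"].get(name) != version
    )


snapshot = environment_snapshot()
print(json.dumps(snapshot, indent=2))
```

Written to a JSON file next to the model, such a snapshot could be compared with `mismatches` before loading, so that a version difference is noticed before it causes an obscure error.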
If a model moves between processes via the disc, or is restored frequently from storage because it cannot stay in memory permanently, the performance of loading and storing becomes relevant. In that case, we can use joblib as a more performant alternative.
Now let us see how we can persist the model of Listing 4.1 as a pickle file.
from pickle import dump

with open("model.pkl", "wb") as f:
    dump(voting_clf, f, protocol=5)

Again, the file format is binary so it does not make a lot of sense to actually read the file in plain text but we can have a look at the size
304K model.pkl
and we can see that the storage demands are higher than for ONNX.
We restore the model via
from pickle import load

with open("model.pkl", "rb") as f:
    clf = load(f)

score = clf.score(X_test, y_test)
print(f"We have test_score of {score} after loading the object again.")
score = clf.score(X_train, y_train)
print(f"We have train_score of {score} after loading the object again.")

We have test_score of 0.8 after loading the object again.
We have train_score of 1.0 after loading the object again.
As we can see, the score stays the same and we can deal with the loaded object in the same way as with the original.
4.3 skops.io - the more secure Python alternative
As an alternative to pickle we can use skops.io. It is developed as a secure alternative to pickle and therefore supports a wide range of objects. The main idea is that only trusted functions are loaded and not everything included in the file. It is also possible to verify our data before loading it into our program, increasing the security further. Still, it returns the Python object, if it can be loaded, and we can manipulate it in the same fashion as with pickle.
As a downside, the process is slower and some user-defined functions/objects might not work as desired. Similar to pickle, we also need to have the same environment for loading as we had for storing the Python object.
The interface itself is simple and follows pickle.
import skops.io as sio

obj = sio.dump(voting_clf, "model.skops")

For comparison, we show the size of the file
26M model.skops
and we can see that this format has a significantly higher overhead than the other formats.
Retrieving the model is a two-step process: first we list the untrusted types, and then we load the verified objects.
unknown_types = sio.get_untrusted_types(file="model.skops")  # (1)
for i, a in enumerate(unknown_types):
    print(f"Unknown type at {i} is {a}.")

clf = sio.load("model.skops", trusted=unknown_types)
score = clf.score(X_test, y_test)
print(f"We have test_score of {score} after loading the object again.")
score = clf.score(X_train, y_train)
print(f"We have train_score of {score} after loading the object again.")

(1) We should investigate the contents of unknown_types, and only load if we trust everything we see.

Unknown type at 0 is sklearn.utils._bunch.Bunch.
We have test_score of 0.8 after loading the object again.
We have train_score of 1.0 after loading the object again.
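The inspection of the reported types can be partly automated by comparing them against an allow-list we maintain ourselves. The following is a minimal sketch in plain Python; the helper `split_trusted` and the contents of the allow-list are assumptions, and the list of names stands in for the return value of `sio.get_untrusted_types`.

```python
ALLOWED_TYPES = {
    # Assumption: types we reviewed and decided to trust for this model.
    "sklearn.utils._bunch.Bunch",
}


def split_trusted(unknown_types, allowed=ALLOWED_TYPES):
    """Split reported type names into (trusted, rejected) lists."""
    trusted = [t for t in unknown_types if t in allowed]
    rejected = [t for t in unknown_types if t not in allowed]
    return trusted, rejected


# Stand-in for the types reported for "model.skops" in the output above.
reported = ["sklearn.utils._bunch.Bunch"]
trusted, rejected = split_trusted(reported)
if rejected:
    raise RuntimeError(f"Refusing to load, unreviewed types: {rejected}")
print(f"All {len(trusted)} reported types are on the allow-list.")
```

Only the `trusted` list would then be handed to `sio.load(..., trusted=trusted)`, so a file that suddenly reports an unexpected type is rejected instead of loaded blindly.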
4.4 Comparison of the different approaches
The scikit-learn docs do an excellent job of summarizing the key differences.
Based on the different approaches for model persistence, the key points for each approach can be summarized as follows:
ONNX: It provides a uniform format for persisting any machine learning or deep learning model (other than scikit-learn) and is useful for model inference (predictions). It can however, result in compatibility issues with different frameworks.
skops.io: Trained scikit-learn models can be easily shared and put into production using skops.io. It is more secure compared to alternate approaches based on pickle because it does not load arbitrary code unless explicitly asked for by the user. Such code needs to be packaged and importable in the target Python environment.
joblib: Efficient memory mapping techniques make it faster when using the same persisted model in multiple Python processes when using mmap_mode="r". It also gives easy shortcuts to compress and decompress the persisted object without the need for extra code. However, it may trigger the execution of malicious code when loading a model from an untrusted source as any other pickle-based persistence mechanism.
pickle: It is native to Python and most Python objects can be serialized and deserialized using pickle, including custom Python classes and functions as long as they are defined in a package that can be imported in the target environment. While pickle can be used to easily save and load scikit-learn models, it may trigger the execution of malicious code while loading a model from an untrusted source. pickle can also be very efficient memorywise if the model was persisted with protocol=5 but it does not support memory mapping.
cloudpickle: It has comparable loading efficiency as pickle and joblib (without memory mapping), but offers additional flexibility to serialize custom Python code such as lambda expressions and interactively defined functions and classes. It might be a last resort to persist pipelines with custom Python components such as a sklearn.preprocessing.FunctionTransformer that wraps a function defined in the training script itself or more generally outside of any importable Python package. Note that cloudpickle offers no forward compatibility guarantees and you might need the same version of cloudpickle to load the persisted model along with the same version of all the libraries used to define the model. As the other pickle-based persistence mechanisms, it may trigger the execution of malicious code while loading a model from an untrusted source.
Source: scikit-learn.org, accessed 07.03.2025.
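The limitation that motivates cloudpickle can be illustrated with the standard library alone: plain pickle stores functions by reference to an importable name, so a lambda expression cannot be serialized. The following is a minimal sketch; `can_pickle` is a hypothetical helper, and the set of caught exceptions is an assumption meant to cover different Python versions.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


square = lambda x: x * x  # defined inline, has no importable name

print("built-in len:", can_pickle(len))        # pickled by reference
print("lambda expression:", can_pickle(square))  # no name to look up
```

A pipeline step that wraps such an inline function runs into the same problem, which is the situation in which the quoted summary describes cloudpickle as a last resort.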
4.5 Further considerations
Now that we know how to persist our models, or at least hope to do so, we need to talk about how we keep track of our different model versions (parameters, training data, random seeds, etc.).
In the previous exercises we created multiple versions of our model and stored them to disc. If we now look at the different files, do we still know which version corresponds to which code block?
As we experiment with different parameters for our composite method - in pursuit of better results - we’ll likely generate even more model variations. To ensure reproducibility, we need a way to track our models alongside the code and parameters that produce them. This is what we are going to look into in the next section.
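As a first impression of what such tracking might involve, the sketch below records a content hash together with the parameters used for a model file. It relies only on the standard library; the helper names `file_sha256` and `register_model`, the manifest layout, and the file name `models.json` are assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def register_model(model_path, params, manifest_path="models.json"):
    """Append an entry for model_path to a JSON manifest and return it."""
    manifest_file = Path(manifest_path)
    entries = json.loads(manifest_file.read_text()) if manifest_file.exists() else []
    entry = {
        "file": Path(model_path).name,
        "sha256": file_sha256(model_path),
        "params": params,
    }
    entries.append(entry)
    manifest_file.write_text(json.dumps(entries, indent=2))
    return entry


if __name__ == "__main__":
    import tempfile

    # A temporary file stands in for a persisted model such as model.pkl.
    with tempfile.TemporaryDirectory() as tmp:
        model_file = Path(tmp) / "model.pkl"
        model_file.write_bytes(b"stand-in for a pickled model")
        entry = register_model(
            model_file,
            {"n_estimators": 500, "max_leaf_nodes": 2, "random_state": 6020},
            manifest_path=Path(tmp) / "models.json",
        )
        print(entry["file"], entry["sha256"][:12])
```

The hash ties each manifest entry to one exact file, so two files with the same name but different contents can still be told apart.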