Security Concerns

The main security concern with Python's pickle module is the deserialisation of untrusted data. When you call pickle.load() or pickle.loads(), the unpickler reconstructs Python objects from the byte stream, and in doing so it imports whatever modules the stream names and calls whatever callables the stream references. An attacker who controls the stream can craft a payload that, when deserialised, can:

  1. Execute arbitrary code: The pickle protocol can be manipulated to cause the deserialiser to import arbitrary modules and call arbitrary functions with arbitrary arguments. This permits an attacker to execute system commands, delete files, or perform any action the Python process running pickle.load() has permissions to carry out. This is commonly referred to as a “deserialisation vulnerability” or “arbitrary code execution.”

  2. Cause Denial of Service (DoS): An attacker could create a pickled object that, when deserialised, consumes excessive memory or CPU resources, leading to your application crashing or becoming unresponsive.

Preventive Measures

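The most effective preventive measure is to avoid using pickle for deserialising untrusted data entirely. Where this is not possible, restrict which globals the unpickler may import, authenticate the data before loading it, or switch to a data-only format such as JSON.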
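One approach, which the Python documentation calls restricting globals, is to subclass pickle.Unpickler and override find_class so that only an explicit allow-list can be imported. The following is a minimal sketch under assumptions: the names SAFE_BUILTINS, RestrictedUnpickler and restricted_loads, and the contents of the allow-list, are illustrative choices, not part of the pickle module.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: the only globals this application agrees to load.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks the unpickler to import a global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else (os.system, eval, user classes, ...) is refused.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but rejects globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers, strings and numbers need no globals, so they load as usual, while a payload that references any other callable raises pickle.UnpicklingError before that callable is invoked. This narrows what a payload can call; it does not by itself bound the memory or CPU time that a hostile stream can consume.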

Example

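Consider the following code, which builds a malicious pickle payload and then loads it:

import os
import pickle

# WARNING: illustrative only. Do not run this; loading the payload invokes a destructive shell command.
class Malicious:
    def __reduce__(self):
        # Tells the unpickler to rebuild this object by calling os.system('rm -rf /')
        return (os.system, ('rm -rf /',))

malicious_data = pickle.dumps(Malicious())  # Serialising is harmless: the call is only recorded
pickle.loads(malicious_data)  # Deserialising attempts to execute 'rm -rf /'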

In this example, the __reduce__ method returns a tuple instructing the unpickler to call os.system with the argument 'rm -rf /'. When deserialised, this would execute a destructive system command.
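To see what such a payload asks the unpickler to import without running it, the standard pickletools module can parse the opcode stream. The sketch below uses a harmless stand-in class (Harmless, which calls print instead of a shell command) and a hypothetical helper, referenced_globals. Both names are illustrative, and the helper is a best-effort heuristic for inspection, not a security control: a crafted stream can supply names in ways this simple scan does not follow.

```python
import pickle
import pickletools


class Harmless:
    """Stand-in for a malicious class: calls print instead of a shell."""

    def __reduce__(self):
        return (print, ("code ran during unpickling",))


def referenced_globals(data):
    """List the globals a pickle stream names, without executing it."""
    found = []
    recent_strings = []  # string arguments seen so far, used for STACK_GLOBAL
    # genops only parses the stream; it never imports or calls anything.
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" in a single string.
            found.append(arg.replace(" ", ".", 1))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two strings pushed just before.
            found.append(".".join(recent_strings[-2:]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


payload = pickle.dumps(Harmless())
print(referenced_globals(payload))  # ['builtins.print']
```

The payload contains no trace of the Harmless class itself, only the callable and arguments returned by __reduce__, which is why the receiving side cannot tell a benign object from a hostile one by its type.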

Discussion

The root cause of the security issue lies in pickle's design: to reconstruct arbitrary Python objects, including instances of user-defined classes, the format lets the byte stream name any importable callable and have the unpickler call it. This flexibility comes at the cost of security when handling untrusted data.
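The trade-off is easier to see by contrast with a data-only format. The sketch below uses the standard json module; the record contents are hypothetical. Decoding JSON can only produce dicts, lists, strings, numbers, booleans and None, so malformed or hostile input results in an error or in plain data, never in a function call.

```python
import json

# Hypothetical record made only of JSON-compatible types.
record = {"user": "alice", "roles": ["admin", "dev"], "active": True}

encoded = json.dumps(record)   # text, safe to store or send
decoded = json.loads(encoded)  # rebuilds plain data structures only


def try_decode(text):
    """Return the decoded value, or None if the input is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

The cost is expressiveness: custom classes have to be converted to and from these basic types explicitly.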

The weakness is therefore not a bug in pickle but a consequence of using it on data from a source you do not trust. The Python documentation explicitly warns against using pickle with untrusted data.
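When pickled data must cross a boundary where it could be tampered with, the Python documentation suggests signing it with hmac. The following is a minimal sketch under assumptions: SECRET_KEY, signed_dumps and verified_loads are hypothetical names, and the key shown is a placeholder that would come from secure configuration in practice.

```python
import hashlib
import hmac
import pickle

# Hypothetical shared secret; load a real one from secure configuration.
SECRET_KEY = b"replace-with-a-real-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def signed_dumps(obj) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag over the pickled bytes."""
    payload = pickle.dumps(obj)
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload


def verified_loads(blob: bytes):
    """Unpickle only if the tag matches; otherwise raise ValueError."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

The check runs before pickle.loads, so a modified payload is rejected without being deserialised. This only proves that the data came from a holder of the key; it offers no protection if the key leaks or if the signer itself pickles attacker-controlled objects.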

More Information