The shelve module offers a simple, persistent, dictionary-like interface for storing and retrieving Python objects to and from disk. While convenient for basic local persistence, it is built directly on top of the pickle module and inherits all of its serious security risks.
Security Concerns
The shelve module uses pickle for both serialization (when a value is assigned to a key, and again in Shelf.sync() or Shelf.close() when writeback=True) and deserialization (when a value is retrieved). This means that reading data from a shelf file can trigger arbitrary code execution if the file originates from an untrusted source.
Key risks include:
- Arbitrary code execution: Malicious shelf files can exploit pickle’s powerful reconstruction capabilities (via __reduce__, custom classes, etc.) to run arbitrary Python code during unpickling.
- No built-in authentication or validation: shelve performs no integrity checks or sandboxing by default.
- File-based attack surface: Any process with write access to the shelf file (or the ability to supply a malicious .db file) can compromise the application that opens it.
- Silent failures and hidden dangers: The convenience of the dict-like API can lead developers to treat shelf data as trusted without proper validation.
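The first risk can be made concrete with a harmless sketch. The class name Payload and the file name demo_shelf are illustrative only, and sorted stands in for whatever callable an attacker would choose (for example os.system). The point is that the stored bytes, not the reading application, decide which callable runs when the value is read back.

```python
import os
import shelve
import tempfile


class Payload:
    """Stand-in for an attacker-controlled object (hypothetical name)."""

    def __reduce__(self):
        # pickle records "call sorted([3, 1, 2])" instead of this object's state.
        # A real attack would name something like os.system here.
        return (sorted, ([3, 1, 2],))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_shelf")
        # Writing pickles the value using the recipe from __reduce__.
        with shelve.open(path) as db:
            db["payload"] = Payload()
        # Reading unpickles the value, which calls the recorded callable.
        with shelve.open(path) as db:
            return db["payload"]


loaded = demo()
print(type(loaded).__name__, loaded)  # prints: list [1, 2, 3]
```

What comes back is not a Payload instance but the result of the call that the file asked for.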
Preventive Measures
- Never load untrusted shelves: Avoid using shelve.open() on files received from users, networks, downloads, or any untrusted location.
- Prefer safer alternatives:
  - Use json, toml, yaml (with safe loaders), or structured formats like Protocol Buffers / MessagePack with schema validation.
  - For more complex object persistence, consider ORMs with proper input sanitisation (e.g., SQLAlchemy) or secure serialization libraries.
- Validate and sanitise data: Even when using shelve for trusted data, implement strict schema validation after loading.
- Use with pickle restrictions: If you must use shelve, note that the pickle protocol version only affects the data format and does not make loading any safer. A custom Unpickler subclass that overrides find_class() to limit global imports can reduce exposure, though wiring it into shelve is complex and not foolproof.
- File permissions and isolation: Store shelf files with strict permissions and run the application with the principle of least privilege.
- Consider dbm directly: For simple key-value storage of bytes/strings, use the lower-level dbm modules instead of shelve.
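One way the custom Unpickler idea might be wired into a shelf is sketched below. RestrictedUnpickler, RestrictedShelf, and ALLOWED_GLOBALS are hypothetical names, and the allow-list content is an assumption chosen for illustration. The sketch overrides item retrieval only, bypasses the writeback cache, and should be read as a mitigation under those assumptions rather than a guarantee: allow-listed classes can still be misused, and the underlying dbm file is parsed as before.

```python
import io
import os
import pickle
import shelve
import tempfile

# Hypothetical allow-list: only these (module, name) globals may be loaded.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global (class or function).
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


class RestrictedShelf(shelve.DbfilenameShelf):
    """Shelf whose reads go through RestrictedUnpickler (hypothetical class)."""

    def __getitem__(self, key):
        # Skips the writeback cache on purpose: every read is re-validated.
        raw = self.dict[key.encode(self.keyencoding)]
        return RestrictedUnpickler(io.BytesIO(raw)).load()


class Payload:
    """Harmless stand-in for a malicious object."""

    def __reduce__(self):
        return (sorted, ([3, 1, 2],))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_shelf")
        with shelve.open(path) as db:
            db["settings"] = {"theme": "dark", "timeout": 30}
            db["payload"] = Payload()
        with RestrictedShelf(path) as db:
            settings = db["settings"]  # plain dict/str/int need no globals
            try:
                db["payload"]
                blocked = False
            except pickle.UnpicklingError:
                blocked = True  # builtins.sorted is not on the allow-list
    return settings, blocked


settings, blocked = demo()
```

Built-in containers and scalars load without consulting find_class(), so ordinary settings dictionaries keep working while any reference to an unlisted callable is rejected.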
Example
Safe usage (trusted local data only):

    import shelve

    with shelve.open('local_config.db') as db:
        db['user_settings'] = {'theme': 'dark', 'timeout': 30}
        settings = db['user_settings']  # Safe because file is trusted

Dangerous usage (to avoid):

    import shelve

    # Never do this with files from untrusted sources
    with shelve.open('untrusted_data.db') as db:
        malicious_data = db['payload']  # Unpickling here can execute arbitrary code!

Discussion
The shelve module is only appropriate for fully trusted environments, typically single-user applications where the shelf file is never exposed to external input or third parties. In practice, that guarantee is hard to maintain for most use cases.
The fundamental problem with shelve is that it provides an attractive, high-level interface that hides the dangerous pickle implementation underneath. Developers often underestimate the risk because the API feels like a simple dictionary.
Unlike pickle used directly, shelve adds the complication of persistent files that may be tampered with over time or replaced by attackers. This makes it especially dangerous in desktop applications, plugins, or any scenario where files can be swapped.
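As a partial defence against swapped or tampered files, an application can refuse to open a shelf whose file is writable by other users. The helper below is a minimal, POSIX-only sketch; is_private_file and the file name settings.db are hypothetical. It assumes the caller knows the actual on-disk file name, which depends on the dbm backend (some add a suffix or use several files), and it does not close the window between the check and the later open.

```python
import os
import stat
import tempfile


def is_private_file(path: str) -> bool:
    """Return True if path is a regular file owned by the current user
    and not writable by group or others (POSIX only)."""
    info = os.stat(path, follow_symlinks=False)
    if not stat.S_ISREG(info.st_mode):
        return False  # symlinks, directories, devices are rejected
    if info.st_uid != os.getuid():
        return False  # someone else owns the file
    return not (info.st_mode & (stat.S_IWGRP | stat.S_IWOTH))


# Illustration with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    candidate = os.path.join(tmp, "settings.db")
    with open(candidate, "wb"):
        pass
    os.chmod(candidate, 0o600)
    private_ok = is_private_file(candidate)  # owner-only permissions
    os.chmod(candidate, 0o666)
    shared_ok = is_private_file(candidate)   # world-writable
```

A check like this narrows the file-based attack surface but does not make unpickling safe; it only reduces who could have modified the file.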
The official Python documentation for both pickle and shelve warns against loading data from untrusted sources, and general security guidance recommends avoiding pickle-based solutions for anything beyond fully controlled, internal use cases. The convenience of shelve rarely outweighs the long-term maintenance and security burden it introduces.
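For simple settings, a pickle-free replacement can be sketched with dbm for storage and json for encoding. The function names, the file name settings_store, and the expected keys (theme, timeout) are assumptions made for this illustration; a real application would substitute its own schema.

```python
import dbm
import json
import os
import tempfile


def save_settings(path: str, name: str, settings: dict) -> None:
    # Values are stored as UTF-8 JSON bytes, so nothing executable is persisted.
    with dbm.open(path, "c") as db:
        db[name.encode("utf-8")] = json.dumps(settings).encode("utf-8")


def load_settings(path: str, name: str) -> dict:
    with dbm.open(path, "r") as db:
        raw = db[name.encode("utf-8")]
    settings = json.loads(raw.decode("utf-8"))
    # Minimal schema check (assumed schema): reject anything unexpected.
    if (
        not isinstance(settings, dict)
        or not isinstance(settings.get("theme"), str)
        or not isinstance(settings.get("timeout"), int)
    ):
        raise ValueError("settings do not match the expected schema")
    return settings


with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "settings_store")
    save_settings(store, "user_settings", {"theme": "dark", "timeout": 30})
    restored = load_settings(store, "user_settings")
```

JSON can only produce dictionaries, lists, strings, numbers, booleans, and None, so a tampered file can yield wrong data but cannot name a callable to run during loading.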
