File Handling¶
File handling in ORMs is notoriously challenging. Edgy aims to simplify and secure this process.
Edgy implements three security layers:
- Restricting Image Formats: For
ImageFields
, Edgy restricts the image formats that can be parsed, mitigating potential security vulnerabilities. - Approved-Only File Opening: Edgy provides an option for approved-only file opening, ensuring that potentially dangerous files are not automatically processed.
- Direct Size Access: Edgy offers direct size access as a field, simplifying quota management and eliminating the need for manual size tracking. This feature is configurable.
To align with ORM best practices, Edgy employs a staging and non-overwrite concept for enhanced safety. This prevents file name clashes and ensures that files are not overwritten if the save process fails.
Edgy uses process PID prefixing and thread-safe name reservation, committing file changes only after saving the model instance. This eliminates concerns about file overwrites, regardless of whether you're using processes or threads.
Edgy provides three relevant file classes:
- File: A base class for raw I/O operations on file-like objects, offering various helper functions.
- ContentFile: An in-memory I/O class that transforms bytes on the fly and can also be used with files.
- FieldFile: A transactional handler for field operations, used in
FileField
and its subclasses.
Configuration¶
File handling is configured through global settings, which can be found in edgy/conf/global_settings.py
.
Direct Access¶
Direct file access is possible through the storages
object in edgy.files
. This allows you to access files directly using a storage of your choice, obtaining URLs or paths for direct file access. This approach minimizes the impact on other parts of your application.
However, direct access provides limited transactional file handling. For more control, use the save
method explicitly.
Fields¶
FileFields
and ImageFields
are the recommended way to handle files within database tables. ImageFields
are a subclass of FileFields
with additional image-related extensions.
Edgy fields support a multi-store concept, allowing you to use multiple storage backends within the same field. The storage name is saved alongside the file name, enabling retrieval of the correct storage later.
FileField¶
FileField
allows you to save files alongside your database records. Edgy handles file operations automatically, including cleanups when files are unreferenced.
Edgy also prevents overwriting files by default, even in the event of failed save operations.
Setting a FileField
to None
implicitly deletes the file after saving.
For finer control, you can use the methods of the FieldFile
class.
Tip
Set null=True
to allow file deletion and maintain a consistent state. Use delete(instant=True)
to bypass transactional file handling and delete files immediately (use with caution).
Parameters¶
storage
: (Default:default
) The default storage to use.with_size
: (Default:True
) Enable the size field.with_metadata
: (Default:True
) Enable the metadata field.with_approval
: (Default:False
) Enable approval logic.extract_mime
: (Default:True
) Save MIME type in metadata. Set to"approved_only"
to do this only for approved files.mime_use_magic
: (Default:False
) Use thepython-magic
library to get the MIME type.field_file_class
: Provide a customFieldFile
class.generate_name_fn
: Customize name generation.multi_process_safe
: (Default:True
) Prefix name with process ID.
Tip
Disable process PID prefixing with multi_process_safe=False
.
Note
Process PID prefixing requires all processes to be in the same process namespace. Use generate_name_fn
to add unique identifiers.
FieldFile¶
FieldFile
tracks changes and provides a file-like interface.
You can manipulate files by setting a file-like object or None
, or use the following methods:
save(content, *, name="", delete_old=True, multi_process_safe=None, approved=None, storage=None, overwrite=None)
: Save a file.delete(*, instant, approved)
: Stage file deletion.set_approved(bool)
: Set the approved flag.
content
supports file-like objects, bytes, and File
instances.
Tip
Overwrite files with overwrite=True
and the old file name.
Tip
Set the approved flag during save or delete.
Tip
Use multi_process_safe
and overwrite
together by specifying both parameters explicitly.
Metadata¶
FileField
metadata includes:
mime
If with_size
is True
, the file size is available in the database as <file_field_name>_size
.
ImageField¶
ImageField
extends FileField
for image handling, allowing you to restrict loaded image formats.
Extra Parameters¶
with_approval
: (Default:True
) Enable approval logic.image_formats
: (Default:[]
) Allowed Pillow formats for non-approved files.approved_image_formats
: (Default:None
) Allowed Pillow formats for approved files.
ImageFieldFile¶
ImageFieldFile
is a subclass of FieldFile
with an additional method:
open_image
: Opens the file as a PILImageFile
.
Note
open_image
honors the format restrictions specified by ImageField
.
Metadata¶
ImageField
metadata includes:
mime
height
width
The file size is also available, similar to FileField
.
Concepts¶
Quota¶
Use the size field to calculate storage usage efficiently.
import os
from typing import Any
from sqlalchemy import func
import edgy
from edgy.exceptions import FileOperationError
models = edgy.Registry(database=...)
class Document(edgy.StrictModel):
file: edgy.files.FieldFile = edgy.fields.FileField(with_size=True)
class Meta:
registry = models
async def main():
document = await Document.query.create(file=b"abc")
document2 = await Document.query.create(file=b"aabc")
document3 = await Document.query.create(file=b"aabcc")
sum_of_size = await Document.database.fetch_val(func.sum(Document.columns.file_size))
Metadata¶
Metadata is stored in a JSONField
. Extend metadata by subclassing fields and providing a custom extract_metadata
method.
The metadata column is named <file_or_image_field_name>_mdata
.
Approval Concept¶
The with_approval
parameter enables per-file approval. Non-approved files have limited attribute extraction.
ImageField
allows specifying image formats that are processed regardless of approval.
Third-party applications can check the <file_or_image_field_name>_approved
field or <file_or_image_field_name>_ok
column to determine file approval status.