Pydantic discriminated unions. Some examples to simplify data structures and ensure type safety

Powerful type validation and discriminated Unions with Pydantic: Simplify data structures and ensure type safety. We are showcasing a few straightforward examples.

Pydantic Python library logo.

What is an union discriminator or tagged unions, and its role in Pydantic?

Well, well, well, look who decided to stroll into the world of Pydantic discriminators! 🕶️ Brace yourselves, folks, because we’re about to take a sarcastic and catchy rollercoaster ride through this wild jungle of coding wonders. Buckle up!

So, what’s all the buzz about Pydantic discriminators? Oh, they’re just the coolest thing since sliced bread, my friends! Picture this: you have a bunch of data models, each with their own quirks and peculiarities. It’s like dealing with a bunch of divas in a high school drama, except instead of gossip, it’s all about attributes and properties. Drama queens, am I right?

Now, let’s say you want to pick out the perfect model from this chaotic ensemble. How on earth are you going to do that? Fear not, because Pydantic discriminators are here to save the day, just like a superhero with an ironic sense of humor. They’re like the Sherlock Holmes of model selection, deducing the perfect fit for you.


How does Pydantic discriminator works?

Pydantic’s discriminator feature allows the definition of data structures with multiple types, using a discriminator field to determine the actual object type. This enables type validation and serialization/deserialization based on the discriminator value, ensuring data integrity and flexibility in representing different types of objects.

Since Pydantic 1.9, we could make use of it. Let’s showcase it in an easy way:

from pydantic import BaseModel, Field, parse_obj_as
from typing import Literal, Union, Annotated

class Tiger(BaseModel):
    animal_type: Literal["tiger"] = "tiger"
    ferocity_scale: float = Field(..., ge=0, le=10)

class Shark(BaseModel):
    animal_type: Literal["shark"] = "shark"
    ferocity_scale: float = Field(..., ge=0, le=10)

class Lion(BaseModel):
    animal_type: Literal["lion"] = "lion"
    ferocity_scale: float

class WildAnimal(BaseModel):
    __root__: Annotated[Union[Tiger, Shark, Lion], Field(..., discriminator='animal_type')]

my_shark = WildAnimal.parse_obj({'animal_type': 'shark', 'ferocity_scale': 5}).__root__
#print(Shark(ferocity_scale=5).json())

# Desarialice
WildAnimal.parse_raw(Shark(ferocity_scale=5).json())
## WildAnimal(__root__=Shark(animal_type='shark', ferocity_scale=5.0))
print(isinstance(my_shark, Shark))
## True

The below polimorfic code example and some other interesting discussions could be found here: https://github.com/pydantic/pydantic/discussions/5785


Pydantic Annotated union discriminator example

But we could use a very simple approach to achieve most of the usage by using the Annotated union.

Animal = Annotated[Union[Tiger, Shark], Field(discriminator='animal_type')]
raw_data = {
    "animal_type": "tiger",
    "ferocity_scale": 6
}
parse_obj_as(Animal, raw_data)
## Tiger(animal_type='tiger', ferocity_scale=6.0)

Get ready for the magic of the Field class, courtesy of Pydantic. It’s armed with a special power called “discriminator.” By setting the discriminator to “pet_type,” we unlock the ability to distinguish between our fantastic creatures. It’s like giving them their own special spotlight!

Hold on tight, because we’re about to venture into the wild lands of raw_data. It holds the secrets of a “pet_type” with the fiery spirit of a “tiger” and a mesmerizing “stripes” count of 6. It’s as if we’re peering into a digital zoo!

And now, it’s showtime! We summon the powerful parse_obj_as to work its coding wizardry. We present it with our regal Animal and the enigmatic raw_data. Abracadabra! With a wave of its wand, the transformation unfolds. The raw data morphs into a stunning representation of our chosen Animal. It’s like a magical metamorphosis!


Example of Polimorfic Base Model

class PolymorphicBaseModel(BaseModel):
    type: str

    _subtypes = dict()

    def __init_subclass__(subcls, type=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if type:
            # n.b. if a subclass declares its own _subtypes dict, it'll take precedence over this one.
            # This would allow us to re-use the same type names across different classes.
            if type in subcls._subtypes:
                raise AttributeError(
                    f"Class {subcls} cannot be registered with polymorphic type='{type}' because it's already registered "
                    f" to {subcls._subtypes[type]}"
                )
            subcls._subtypes[type] = subcls
    @classmethod
    def _convert_to_real_type(cls, data):
        data_type = data.get("type")

        if data_type is None:
            raise ValueError(f"Missing 'type' for {cls}")

        subcls = cls._subtypes.get(data_type)

        if subcls is None:
            raise TypeError(f"Unsupported sub-type: {data_type}")
        if not issubclass(subcls, cls):
            raise TypeError(f"Inferred class {subcls} is not a subclass of {cls}")

        return subcls(**data)

    @classmethod
    def parse_obj(cls, data):
        return cls._convert_to_real_type(data)
    
    
class Animal(PolymorphicBaseModel):
    name: str
    color: str = None

class Cat(Animal, type="cat"):
    type: Literal["cat"] = "cat"
    hairless: bool

class Dog(Animal, type="dog"):
    type: Literal["dog"] = "dog"
    breed: str

cat_instance = Animal.parse_obj({"type":"cat", "hairless": False, "name": "meaw", "color": "black"})
print(isinstance(cat_instance, Cat))
## True

Tthe PolymorphicBaseModel, a base class that sets the stage for polymorphic behavior. It defines a required type attribute and introduces a hidden _subtypes dictionary to keep track of subtypes.

Next, we dive into the init_subclass method, where the magic happens. It allows subclasses to register themselves with a specific polymorphic type. This lets us distinguish between different subtypes within the PolymorphicBaseModel hierarchy.

But hold on, there’s more to uncover! We make use of the _convert_to_real_type method, responsible for converting data to its actual subtype based on the provided type attribute. It checks if the type is valid, finds the corresponding subclass, and ensures it is a valid subclass of the base class.

Finally, we arrive at the parse_obj method, where the true parsing takes place. It serves as the entry point for parsing objects of the polymorphic hierarchy. Using the _convert_to_real_type method, it transforms the data into an instance of the appropriate subclass.

And there you have it! A glimpse into the realm of polymorphic models. It’s a world where base classes and subtypes come together, allowing for flexible and dynamic object parsing. Embrace the power of polymorphism and let your code adapt and evolve with grace!


Pydantic 2: TypeAdapter to parse data into a discriminated union

In Pydantic v2, you can utilize the TypeAdapter to parse data into a discriminated union. However, please note that Pydantic v2 is currently in pre-release, and the module’s current version is v1.7.

Therefore, make sure to upgrade to Pydantic v2 when it becomes available to utilize this feature.

from pydantic import TypeAdapter

adapter = TypeAdapter(Annotated[Union[Child1, Child2], Field(discriminator='type')])

child = adapter.validate_json(my_json_data)


Stay updated on Pydantic and Python tips

Hopefully, this post has helped familiarize you with the usage of unions and discriminators in Pydantic, showcasing some of its functionalities and enabling you to enjoy their benefits.

If you want to stay updated…

Carlos Vecina
Carlos Vecina
Senior Data Scientist at Jobandtalent

Senior Data Scientist at Jobandtalent | AI & Data Science for Business

Related