Type Checking
Have uv? ⚡
If you have uv installed, you can instantly open this page as a Jupyter notebook using opennb:
uvx --with "pipefunc[docs]" opennb pipefunc/pipefunc/docs/source/concepts/type-checking.md
This command creates an ephemeral environment with all dependencies and launches the notebook in your browser in 1 second, with no manual setup needed! ✨
Alternatively, run:
uv run https://raw.githubusercontent.com/pipefunc/pipefunc/refs/heads/main/get-notebooks.py
to download all documentation as Jupyter notebooks.
How does type checking work in pipefunc?
pipefunc supports type checking for function arguments and outputs using Python type hints.
It ensures that the output of one function matches the expected input types of the next function in the pipeline.
This is crucial for maintaining data integrity and catching errors early in pipeline-based workflows.
Basic type checking
Here’s an example of pipefunc raising a TypeError when the types don’t match:
from pipefunc import Pipeline, pipefunc

# All type hints that are not relevant for this example are omitted!

@pipefunc(output_name="y")
def f(a) -> int:  # output 'y' is expected to be an `int`
    return 2 * a

@pipefunc(output_name="z")
def g(y: str):  # here 'y' is expected to be a `str`
    return y.upper()

# Creating the `Pipeline` will raise a `TypeError`
pipeline = Pipeline([f, g])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[1], line 14
10 def g(y: str): # here 'y' is expected to be a `str`
11 return y.upper()
12
13 # Creating the `Pipeline` will raise a `TypeError`
---> 14 pipeline = Pipeline([f, g])
File ~/checkouts/readthedocs.org/user_builds/pipefunc/checkouts/stable/pipefunc/_pipeline/_base.py:209, in Pipeline.__init__(self, functions, lazy, debug, print_error, profile, cache_type, cache_kwargs, validate_type_annotations, scope, default_resources, name, description)
207 else:
208 mapspec = None
--> 209 self.add(f, mapspec=mapspec)
210 self._cache_type = cache_type
211 self._cache_kwargs = cache_kwargs
File ~/checkouts/readthedocs.org/user_builds/pipefunc/checkouts/stable/pipefunc/_pipeline/_base.py:356, in Pipeline.add(self, f, mapspec)
353 f.print_error = self.print_error
355 self._clear_internal_cache() # reset cache
--> 356 self.validate()
357 return f
File ~/checkouts/readthedocs.org/user_builds/pipefunc/checkouts/stable/pipefunc/_pipeline/_base.py:1609, in Pipeline.validate(self)
1607 self._validate_mapspec()
1608 if self.validate_type_annotations:
-> 1609 validate_consistent_type_annotations(self.graph)
File ~/checkouts/readthedocs.org/user_builds/pipefunc/checkouts/stable/pipefunc/_pipeline/_validation.py:97, in validate_consistent_type_annotations(graph)
86 if not is_type_compatible(output_type, input_type):
87 msg = (
88 f"Inconsistent type annotations for:"
89 f"\n - Argument `{parameter_name}`"
(...) 95 " Disable this check by setting `validate_type_annotations=False`."
96 )
---> 97 raise TypeError(msg)
TypeError: Inconsistent type annotations for:
- Argument `y`
- Function `f(...)` returns:
`<class 'int'>`.
- Function `g(...)` expects:
`<class 'str'>`.
Please make sure the shared input arguments have the same type.
Note that the output type displayed above might be wrapped in `pipefunc.typing.Array` if using `MapSpec`s. Disable this check by setting `validate_type_annotations=False`.
In this example, function f outputs an int, but function g expects a str input.
When we try to create the pipeline, it will raise a TypeError due to this type mismatch.
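The idea behind this check can be sketched with the standard library alone. The snippet below is an illustration, not pipefunc's implementation: `check_connection` is a hypothetical helper, it compares annotations with plain equality, and it assumes that connections without hints are simply skipped. The traceback above shows that pipefunc itself relies on its own `is_type_compatible` logic.

```python
from typing import get_type_hints


def f(a) -> int:
    return 2 * a


def g(y: str):
    return y.upper()


def check_connection(producer, consumer, name):
    """Hypothetical helper: compare producer's return hint with consumer's hint for `name`."""
    output_type = get_type_hints(producer).get("return")
    input_type = get_type_hints(consumer).get(name)
    if output_type is None or input_type is None:
        return  # assumption of this sketch: unannotated connections are not checked
    if output_type != input_type:  # simplification: no handling of unions or generics
        msg = (
            f"Argument `{name}`: {producer.__name__} returns {output_type}, "
            f"{consumer.__name__} expects {input_type}"
        )
        raise TypeError(msg)


try:
    check_connection(f, g, "y")
except TypeError as exc:
    print(exc)  # reports the int/str mismatch for `y`
```

Because only annotations are inspected, neither function is ever called, which is why a check of this kind can run when the pipeline is built.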
Note
pipefunc checks the type hints only during pipeline construction, not during function execution.
Runtime type checking is planned as an optional feature.
To turn off this type checking, you can set the validate_type_annotations argument to False in the Pipeline constructor:
pipeline = Pipeline([f, g], validate_type_annotations=False)
Note that while disabling type checking allows the pipeline to run, it may lead to runtime errors or unexpected results if the types are not compatible.
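To see what such a runtime error can look like, the sketch below calls the two functions from the example directly, without pipefunc. It assumes that a pipeline would hand the output of f to g unchanged.

```python
def f(a) -> int:
    return 2 * a


def g(y: str):
    return y.upper()


try:
    g(f(1))  # f returns the int 2, which has no .upper() method
except AttributeError as exc:
    error = str(exc)
    print(error)  # 'int' object has no attribute 'upper'
```

With validation enabled, the mismatch is reported when the pipeline is constructed, before any function runs.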
Type checking for Pipelines with MapSpec and reductions
When a pipeline contains a reduction operation (using MapSpecs), type checking is more involved.
The results of an N-dimensional map operation are always stored in a numpy object array, so the original types are preserved in the elements of this array.
The type hint for a function that receives such an array should therefore be numpy.ndarray[Any, numpy.dtype[numpy.object_]].
Unfortunately, static type checkers such as mypy cannot check the types of the elements in an object array.
pipefunc can, however, check the element types at runtime, that is, when it validates the annotations during pipeline construction.
To enable this, use the Array type hint from pipefunc.typing.
This Array generic contains the correct numpy.ndarray type hint for object arrays and is additionally annotated with the element type using typing.Annotated.
For example, Array[int] is numpy.ndarray[Any, numpy.dtype[numpy.object_]] with the element type int stored in the Annotated metadata.
mypy verifies only the numpy array type, whereas pipefunc verifies both the numpy object array type and its element type.
Use it like this:
import numpy as np
from pipefunc import Pipeline, pipefunc
from pipefunc.typing import Array

@pipefunc(output_name="y", mapspec="x[i] -> y[i]")
def double_it(x: int) -> int:
    assert isinstance(x, int)
    return 2 * x

@pipefunc(output_name="sum")
def take_sum(y: Array[int]) -> int:
    # y is a numpy object array of integers
    # the original types are always preserved!
    assert isinstance(y, np.ndarray)
    assert y.dtype == object  # isinstance(y.dtype, object) would be true for any dtype
    assert isinstance(y[0], int)
    return sum(y)

pipeline_map = Pipeline([double_it, take_sum])
pipeline_map.map({"x": [1, 2, 3]})
{'y': Result(function='double_it', kwargs={'x': [1, 2, 3]}, output_name='y', output=array([2, 4, 6], dtype=object), store=DictArray(folder=None, shape=(3,), internal_shape=(), shape_mask=(True,), mapping={(0,): 2, (1,): 4, (2,): 6})), 'sum': Result(function='take_sum', kwargs={'y': DictArray(folder=None, shape=(3,), internal_shape=(), shape_mask=(True,), mapping={(0,): 2, (1,): 4, (2,): 6})}, output_name='sum', output=12, store=<pipefunc.map._result.DirectValue object at 0x7adc5509cf10>)}
For completeness, this is the type hint for Array[int]:
from pipefunc.typing import Array
Array[int]
typing.Annotated[numpy.ndarray[typing.Any, numpy.dtype[numpy.object_]], pipefunc.typing.ArrayElementType[int]]
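How metadata of this kind can be read back is sketched below using only the standard library. The names are stand-ins, not pipefunc's API: `ElementType` is a hypothetical marker playing the role of `pipefunc.typing.ArrayElementType`, and a plain `list` replaces the numpy object array so that the snippet has no dependencies.

```python
from typing import Annotated, Generic, TypeVar, get_args, get_origin, get_type_hints

T = TypeVar("T")


class ElementType(Generic[T]):
    """Hypothetical marker that carries the element type as metadata."""


# Stand-in for Array[int]: the container type plus element metadata
IntArray = Annotated[list, ElementType[int]]


def take_sum(y: IntArray) -> int:
    return sum(y)


# include_extras=True keeps the Annotated wrapper instead of stripping it
hints = get_type_hints(take_sum, include_extras=True)
assert get_origin(hints["y"]) is Annotated

container, marker = get_args(hints["y"])
element_type = get_args(marker)[0]
print(container, element_type)  # <class 'list'> <class 'int'>

# Without include_extras only the container type remains visible
print(get_type_hints(take_sum)["y"])  # <class 'list'>
```

A static checker treats the annotation as the bare container type, while code that inspects the hints at runtime can also recover the element type.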
Static type checking and IDE support
Everything above is about the runtime type validation that pipefunc performs when constructing a Pipeline.
In addition, the @pipefunc decorator preserves the wrapped function’s signature for static type checkers (mypy, pyright) and IDEs.
A decorated function is a PipeFunc[P, R] instance, where P captures the original parameters and R the return type:
from pipefunc import pipefunc

@pipefunc(output_name="c")
def add(a: int, b: float) -> float:
    """Add two numbers together."""
    return a + b

# The comments below show what a static type checker (mypy, pyright) reports
reveal_type(add)  # PipeFunc[(a: int, b: float), float]
reveal_type(add(1, 2.0))  # float
add("wrong", "types")  # error: incompatible argument types
add.update_renames({"a": "x"})  # PipeFunc methods remain fully typed
This means your IDE shows the original parameter names and types in autocompletion and signature help, type checkers validate calls to the decorated function, and the original docstring is available via help(add) and Jupyter’s add?.
Runtime introspection also works as expected: inspect.signature reflects any renames, defaults, and bound arguments.
import inspect
from pipefunc import pipefunc

@pipefunc(output_name="c", renames={"a": "x"})
def add(a: int, b: float) -> float:
    """Add two numbers together."""
    return a + b

inspect.signature(add)
<Signature (x: int, b: float) -> float>
Note
Limitation: static type checkers always see the original function signature.
Features that rewrite the signature at runtime (renames, scope, and parameters added or removed via update_defaults/update_bound) cannot be expressed statically, because ParamSpec captures the signature at decoration time.
In the example above, calling add(x=1, b=2.0) is correct at runtime but will be flagged by a type checker, which expects a.
This applies to unannotated functions too: the parameter names are captured even without type hints.
If this comes up in your code, add a targeted # type: ignore[call-arg] (mypy) or # pyright: ignore[reportCallIssue] comment to the affected calls.
Calls through pipeline(...), pipeline.run(...), and pipeline.map(...) are unaffected.
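Why a rename is visible to inspect.signature but not to a static checker can be illustrated with a small standard-library sketch. `rename_parameters` is a hypothetical helper and only one possible way to rewrite a signature at runtime; it is not claimed to be how pipefunc implements renames.

```python
import inspect


def rename_parameters(func, renames):
    """Hypothetical helper: expose `func` under renamed keyword arguments."""
    inverse = {new: old for old, new in renames.items()}

    def wrapper(**kwargs):
        # Translate the new names back to the original ones before calling
        return func(**{inverse.get(k, k): v for k, v in kwargs.items()})

    sig = inspect.signature(func)
    parameters = [
        p.replace(name=renames.get(p.name, p.name)) for p in sig.parameters.values()
    ]
    # inspect.signature() honors an explicit __signature__ attribute
    wrapper.__signature__ = sig.replace(parameters=parameters)
    return wrapper


def add(a: int, b: float) -> float:
    return a + b


renamed = rename_parameters(add, {"a": "x"})
print(inspect.signature(renamed))  # (x: int, b: float) -> float
print(renamed(x=1, b=2.0))  # 3.0
```

The new name `x` exists only in objects built while the program runs, so a tool that analyzes the source text alone has no way to learn about it.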