mypyc/doc/using_type_annotations.rst - third_party/github.com/python/mypy - Git at Google

 .. _using-type-annotations:

 Using type annotations
 ======================

 You will get the most out of mypyc if you compile code with precise
 type annotations. Not all type annotations will help performance
 equally, however. Using types such as :ref:`primitive types
 <primitive-types>`, :ref:`native classes <native-class-intro>`,
 :ref:`union types <union-types>`, :ref:`trait types <trait-types>`,
 and :ref:`tuple types <tuple-types>` as much as possible is a key to
 major performance gains over CPython.

 In contrast, some other types, including ``Any``, are treated as
 :ref:`erased types <erased-types>`.  Operations on erased types use
 generic operations that work with arbitrary objects, similar to how
 the CPython interpreter works. If you only use erased types, the only
 notable benefits over CPython will be the removal of interpreter
 overhead (from compilation) and a bit of :ref:`early binding
 <early-binding>`, which will usually only give minor performance
 gains.

 .. _primitive-types:

 Primitive types
 ---------------

 The following built-in types are treated as *primitive types* by
 mypyc, and many operations on these types have efficient
 implementations:

 * ``int`` (:ref:`native operations <int-ops>`)
 * ``i64`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
 * ``i32`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
 * ``i16`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
 * ``u8`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
 * ``float`` (:ref:`native operations <float-ops>`)
 * ``bool`` (:ref:`native operations <bool-ops>`)
 * ``str`` (:ref:`native operations <str-ops>`)
 * ``List[T]`` (:ref:`native operations <list-ops>`)
 * ``Dict[K, V]`` (:ref:`native operations <dict-ops>`)
 * ``Set[T]`` (:ref:`native operations <set-ops>`)
 * ``Tuple[T, ...]`` (variable-length tuple; :ref:`native operations <tuple-ops>`)
 * ``None``

 The link after each type lists all supported native, optimized
 operations for the type. You can use all operations supported by
 Python, but *native operations* will have custom, optimized
 implementations.

 Primitive containers
 --------------------

 Primitive container objects such as ``list`` and ``dict`` don't
 maintain knowledge of the item types at runtime -- the item type is
 *erased*.

 This means that item types are checked when items are accessed, not
 when a container is passed as an argument or assigned to another
 variable. For example, here we have a runtime type error on the final
 line of ``example`` (the ``Any`` type means an arbitrary, unchecked
 value)::

     from typing import List, Any

     def example(a: List[Any]) -> None:
         b: List[int] = a  # No error -- items are not checked
         print(b[0])  # Error here -- got str, but expected int

     example(["x"])

 .. _native-class-intro:

 Native classes
 --------------

 Classes that get compiled to C extensions are called native
 classes. Most common operations on instances of these classes are
 optimized, including construction, attribute access and method calls.

 Native class definitions look exactly like normal Python class
 definitions.  A class is usually native if it's in a compiled module
 (though there are some exceptions).

 Consider this example:

 .. code-block::

    class Point:
        def __init__(self, x: int, y: int) -> None:
            self.x = x
            self.y = y

    def shift(p: Point) -> Point:
        return Point(p.x + 1, p.y + 1)

 All operations in the above example use native operations, if the file
 is compiled.

 Native classes have some notable different from Python classes:

 * Only attributes and methods defined in the class body or methods are
   supported.  If you try to assign to an undefined attribute outside
   the class definition, ``AttributeError`` will be raised. This enables
   an efficient memory layout and fast method calls for native classes.

 * Native classes usually don't define the ``__dict__`` attribute (they
   don't have an attribute dictionary). This follows from only having
   a specific set of attributes.

 * Native classes can't have an arbitrary metaclass or use most class
   decorators.

 Native classes only support single inheritance. A limited form of
 multiple inheritance is supported through *trait types*. You generally
 must inherit from another native class (or ``object``). By default,
 you can't inherit a Python class from a native class (but there's
 an :ref:`override <inheritance>` to allow that).

 See :ref:`native-classes` for more details.

 .. _tuple-types:

 Tuple types
 -----------

 Fixed-length
 `tuple types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#tuple-types>`_
 such as ``Tuple[int, str]`` are represented
 as :ref:`value types <value-and-heap-types>` when stored in variables,
 passed as arguments, or returned from functions. Value types are
 allocated in the low-level machine stack or in CPU registers, as
 opposed to *heap types*, which are allocated dynamically from the
 heap.

 Like all value types, tuples will be *boxed*, i.e. converted to
 corresponding heap types, when stored in Python containers, or passed
 to non-native code. A boxed tuple value will be a regular Python tuple
 object.

 .. _union-types:

 Union types
 -----------

 `Union types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#union-types>`_
 and
 `optional types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#optional-types-and-the-none-type>`_
 that contain primitive types, native class types and
 trait types are also efficient. If a union type has
 :ref:`erased <erased-types>` items, accessing items with
 non-erased types is often still quite efficient.

 A value with a union types is always :ref:`boxed <value-and-heap-types>`,
 even if it contains a value that also has an unboxed representation, such
 as an integer or a boolean.

 For example, using ``Optional[int]`` is quite efficient, but the value
 will always be boxed. A plain ``int`` value will usually be faster, since
 it has an unboxed representation.

 .. _trait-types:

 Trait types
 -----------

 Trait types enable a form of multiple inheritance for native classes.
 A native class can inherit any number of traits.  Trait types are
 defined as classes using the ``mypy_extensions.trait`` decorator::

     from mypy_extensions import trait

     @trait
     class MyTrait:
         def method(self) -> None:
             ...

 Traits can define methods, properties and attributes. They often
 define abstract methods. Traits can be generic.

 If a class subclasses both a non-trait class and traits, the traits
 must be placed at the end of the base class list::

     class Base: ...

     class Derived(Base, MyTrait, FooTrait):  # OK
         ...

     class Derived2(MyTrait, FooTrait, Base):
         # Error: traits should come last
         ...

 Traits have some special properties:

 * You shouldn't create instances of traits (though mypyc does not
   prevent it yet).

 * Traits can subclass other traits or native classes, but the MRO must be
   linear (just like with native classes).

 * Accessing methods or attributes through a trait type is somewhat
   less efficient than through a native class type, but this is much
   faster than through Python class types or other
   :ref:`erased types <erased-types>`.

 You need to install ``mypy-extensions`` to use ``@trait``:

 .. code-block:: text

     pip install --upgrade mypy-extensions

 .. _erased-types:

 Erased types
 ------------

 Mypyc supports many other kinds of types as well, beyond those
 described above.  However, these types don't have customized
 operations, and they are implemented using *type erasure*.  Type
 erasure means that all other types are equivalent to untyped values at
 runtime, i.e. they are the equivalent of the type ``Any``. Erased
 types include these:

 * Python classes (including ABCs)

 * Non-mypyc extension types and primitive types (including built-in
   types that are not primitives)

 * `Callable types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#callable-types-and-lambdas>`_

 * `Type variable types <https://mypy.readthedocs.io/en/stable/generics.html>`_

 * Type `Any <https://mypy.readthedocs.io/en/stable/dynamic_typing.html>`_

 * Protocol types

 Using erased types can still improve performance, since they can
 enable better types to be inferred for expressions that use these
 types.  For example, a value with type ``Callable[[], int]`` will not
 allow native calls. However, the return type is a primitive type, and
 we can use fast operations on the return value::

     from typing import Callable

     def call_and_inc(f: Callable[[], int]) -> int:
         # Slow call, since f has an erased type
         n = f()
         # Fast increment; inferred type of n is int (primitive type)
         n += 1
         return n

 If the type of the argument ``f`` was ``Any``, the type of ``n`` would
 also be ``Any``, resulting in a generic, slower increment operation
 being used.

 Strict runtime type checking
 ----------------------------

 Compiled code ensures that any variable or expression with a
 non-erased type only has compatible values at runtime. This is in
 contrast with using *optional static typing*, such as by using mypy,
 when type annotations are not enforced at runtime. Mypyc ensures
 type safety both statically and at runtime.

 ``Any`` types and erased types in general can compromise type safety,
 and this is by design. Inserting strict runtime type checks for all
 possible values would be too expensive and against the goal of
 high performance.

 .. _value-and-heap-types:

 Value and heap types
 --------------------

 In CPython, memory for all objects is dynamically allocated on the
 heap. All Python types are thus *heap types*. In compiled code, some
 types are *value types* -- no object is (necessarily) allocated on the
 heap.  ``bool``, ``float``, ``None``, :ref:`native integer types <native-ints>`
 and fixed-length tuples are value types.

 ``int`` is a hybrid. For typical integer values, it is a value
 type. Large enough integer values, those that require more than 63
 bits (or 31 bits on 32-bit platforms) to represent, use a heap-based
 representation (same as CPython).

 Value types have a few differences from heap types:

 * When an instance of a value type is used in a context that expects a
   heap value, for example as a list item, it will transparently switch
   to a heap-based representation (boxing) as needed.

 * Similarly, mypyc transparently changes from a heap-based
   representation to a value representation (unboxing).

 * Object identity of integers, floating point values and tuples is not
   preserved. You should use ``==`` instead of ``is`` if you are comparing
   two integers, floats or fixed-length tuples.

 * When an instance of a subclass of a value type is converted to the
   base type, it is implicitly converted to an instance of the target
   type.  For example, a ``bool`` value assigned to a variable with an
   ``int`` type will be converted to the corresponding integer.

 The latter conversion is the only implicit type conversion that
 happens in mypyc programs.

 Example::

     def example() -> None:
         # A small integer uses the value (unboxed) representation
         x = 5
         # A large integer uses the heap (boxed) representation
         x = 2**500
         # Lists always contain boxed integers
         a = [55]
         # When reading from a list, the object is automatically unboxed
         x = a[0]
         # True is converted to 1 on assignment
         x = True

 Since integers and floating point values have a different runtime
 representations and neither can represent all the values of the other
 type, type narrowing of floating point values through assignment is
 disallowed in compiled code. For consistency, mypyc rejects assigning
 an integer value to a float variable even in variable initialization.
 An explicit conversion is required.

 Examples::

     def narrowing(n: int) -> None:
         # Error: Incompatible value representations in assignment
         # (expression has type "int", variable has type "float")
         x: float = 0

         y: float = 0.0  # Ok

         if f():
             y = n  # Error
         if f():
             y = float(n)  # Ok

 .. _native-ints:

 Native integer types
 --------------------

 You can use the native integer types ``i64`` (64-bit signed integer),
 ``i32`` (32-bit signed integer), ``i16`` (16-bit signed integer), and
 ``u8`` (8-bit unsigned integer) if you know that integer values will
 always fit within fixed bounds. These types are faster than the
 arbitrary-precision ``int`` type, since they don't require overflow
 checks on operations. They may also use less memory than ``int``
 values. The types are imported from the ``mypy_extensions`` module
 (installed via ``pip install mypy_extensions``).

 Example::

     from mypy_extensions import i64

     def sum_list(l: list[i64]) -> i64:
         s: i64 = 0
         for n in l:
             s += n
         return s

     # Implicit conversions from int to i64
     print(sum_list([1, 3, 5]))

 .. note::

   Since there are no overflow checks when performing native integer
   arithmetic, the above function could result in an overflow or other
   undefined behavior if the sum might not fit within 64 bits.

   The behavior when running as interpreted Python program will be
   different if there are overflows. Declaring native integer types
   have no effect unless code is compiled. Native integer types are
   effectively equivalent to ``int`` when interpreted.

 Native integer types have these additional properties:

 * Values can be implicitly converted between ``int`` and a native
   integer type (both ways).

 * Conversions between different native integer types must be explicit.
   A conversion to a narrower native integer type truncates the value
   without a runtime overflow check.

 * If a binary operation (such as ``+``) or an augmented assignment
   (such as ``+=``) mixes native integer and ``int`` values, the
   ``int`` operand is implicitly coerced to the native integer type
   (native integer types are "sticky").

 * You can't mix different native integer types in binary
   operations. Instead, convert between types explicitly.

 For more information about native integer types, refer to
 :ref:`native integer operations <int-ops>`.
	.. _using-type-annotations:

	Using type annotations
	======================

	You will get the most out of mypyc if you compile code with precise
	type annotations. Not all type annotations will help performance
	equally, however. Using types such as :ref:`primitive types
	<primitive-types>`, :ref:`native classes <native-class-intro>`,
	:ref:`union types <union-types>`, :ref:`trait types <trait-types>`,
	and :ref:`tuple types <tuple-types>` as much as possible is a key to
	major performance gains over CPython.

	In contrast, some other types, including ``Any``, are treated as
	:ref:`erased types <erased-types>`. Operations on erased types use
	generic operations that work with arbitrary objects, similar to how
	the CPython interpreter works. If you only use erased types, the only
	notable benefits over CPython will be the removal of interpreter
	overhead (from compilation) and a bit of :ref:`early binding
	<early-binding>`, which will usually only give minor performance
	gains.

	.. _primitive-types:

	Primitive types
	---------------

	The following built-in types are treated as primitive types by
	mypyc, and many operations on these types have efficient
	implementations:

	* ``int`` (:ref:`native operations <int-ops>`)
	* ``i64`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
	* ``i32`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
	* ``i16`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
	* ``u8`` (:ref:`documentation <native-ints>`, :ref:`native operations <int-ops>`)
	* ``float`` (:ref:`native operations <float-ops>`)
	* ``bool`` (:ref:`native operations <bool-ops>`)
	* ``str`` (:ref:`native operations <str-ops>`)
	* ``List[T]`` (:ref:`native operations <list-ops>`)
	* ``Dict[K, V]`` (:ref:`native operations <dict-ops>`)
	* ``Set[T]`` (:ref:`native operations <set-ops>`)
	* ``Tuple[T, ...]`` (variable-length tuple; :ref:`native operations <tuple-ops>`)
	* ``None``

	The link after each type lists all supported native, optimized
	operations for the type. You can use all operations supported by
	Python, but native operations will have custom, optimized
	implementations.

	Primitive containers
	--------------------

	Primitive container objects such as ``list`` and ``dict`` don't
	maintain knowledge of the item types at runtime -- the item type is
	erased.

	This means that item types are checked when items are accessed, not
	when a container is passed as an argument or assigned to another
	variable. For example, here we have a runtime type error on the final
	line of ``example`` (the ``Any`` type means an arbitrary, unchecked
	value)::

	from typing import List, Any

	def example(a: List[Any]) -> None:
	b: List[int] = a # No error -- items are not checked
	print(b[0]) # Error here -- got str, but expected int

	example(["x"])

	.. _native-class-intro:

	Native classes
	--------------

	Classes that get compiled to C extensions are called native
	classes. Most common operations on instances of these classes are
	optimized, including construction, attribute access and method calls.

	Native class definitions look exactly like normal Python class
	definitions. A class is usually native if it's in a compiled module
	(though there are some exceptions).

	Consider this example:

	.. code-block::

	class Point:
	def __init__(self, x: int, y: int) -> None:
	self.x = x
	self.y = y

	def shift(p: Point) -> Point:
	return Point(p.x + 1, p.y + 1)

	All operations in the above example use native operations, if the file
	is compiled.

	Native classes have some notable different from Python classes:

	* Only attributes and methods defined in the class body or methods are
	supported. If you try to assign to an undefined attribute outside
	the class definition, ``AttributeError`` will be raised. This enables
	an efficient memory layout and fast method calls for native classes.

	* Native classes usually don't define the ``__dict__`` attribute (they
	don't have an attribute dictionary). This follows from only having
	a specific set of attributes.

	* Native classes can't have an arbitrary metaclass or use most class
	decorators.

	Native classes only support single inheritance. A limited form of
	multiple inheritance is supported through trait types. You generally
	must inherit from another native class (or ``object``). By default,
	you can't inherit a Python class from a native class (but there's
	an :ref:`override <inheritance>` to allow that).

	See :ref:`native-classes` for more details.

	.. _tuple-types:

	Tuple types
	-----------

	Fixed-length
	`tuple types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#tuple-types>`_
	such as ``Tuple[int, str]`` are represented
	as :ref:`value types <value-and-heap-types>` when stored in variables,
	passed as arguments, or returned from functions. Value types are
	allocated in the low-level machine stack or in CPU registers, as
	opposed to heap types, which are allocated dynamically from the
	heap.

	Like all value types, tuples will be boxed, i.e. converted to
	corresponding heap types, when stored in Python containers, or passed
	to non-native code. A boxed tuple value will be a regular Python tuple
	object.

	.. _union-types:

	Union types
	-----------

	`Union types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#union-types>`_
	and
	`optional types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#optional-types-and-the-none-type>`_
	that contain primitive types, native class types and
	trait types are also efficient. If a union type has
	:ref:`erased <erased-types>` items, accessing items with
	non-erased types is often still quite efficient.

	A value with a union types is always :ref:`boxed <value-and-heap-types>`,
	even if it contains a value that also has an unboxed representation, such
	as an integer or a boolean.

	For example, using ``Optional[int]`` is quite efficient, but the value
	will always be boxed. A plain ``int`` value will usually be faster, since
	it has an unboxed representation.

	.. _trait-types:

	Trait types
	-----------

	Trait types enable a form of multiple inheritance for native classes.
	A native class can inherit any number of traits. Trait types are
	defined as classes using the ``mypy_extensions.trait`` decorator::

	from mypy_extensions import trait

	@trait
	class MyTrait:
	def method(self) -> None:
	...

	Traits can define methods, properties and attributes. They often
	define abstract methods. Traits can be generic.

	If a class subclasses both a non-trait class and traits, the traits
	must be placed at the end of the base class list::

	class Base: ...

	class Derived(Base, MyTrait, FooTrait): # OK
	...

	class Derived2(MyTrait, FooTrait, Base):
	# Error: traits should come last
	...

	Traits have some special properties:

	* You shouldn't create instances of traits (though mypyc does not
	prevent it yet).

	* Traits can subclass other traits or native classes, but the MRO must be
	linear (just like with native classes).

	* Accessing methods or attributes through a trait type is somewhat
	less efficient than through a native class type, but this is much
	faster than through Python class types or other
	:ref:`erased types <erased-types>`.

	You need to install ``mypy-extensions`` to use ``@trait``:

	.. code-block:: text

	pip install --upgrade mypy-extensions

	.. _erased-types:

	Erased types
	------------

	Mypyc supports many other kinds of types as well, beyond those
	described above. However, these types don't have customized
	operations, and they are implemented using type erasure. Type
	erasure means that all other types are equivalent to untyped values at
	runtime, i.e. they are the equivalent of the type ``Any``. Erased
	types include these:

	* Python classes (including ABCs)

	* Non-mypyc extension types and primitive types (including built-in
	types that are not primitives)

	* `Callable types <https://mypy.readthedocs.io/en/stable/kinds_of_types.html#callable-types-and-lambdas>`_

	* `Type variable types <https://mypy.readthedocs.io/en/stable/generics.html>`_

	* Type `Any <https://mypy.readthedocs.io/en/stable/dynamic_typing.html>`_

	* Protocol types

	Using erased types can still improve performance, since they can
	enable better types to be inferred for expressions that use these
	types. For example, a value with type ``Callable[[], int]`` will not
	allow native calls. However, the return type is a primitive type, and
	we can use fast operations on the return value::

	from typing import Callable

	def call_and_inc(f: Callable[[], int]) -> int:
	# Slow call, since f has an erased type
	n = f()
	# Fast increment; inferred type of n is int (primitive type)
	n += 1
	return n

	If the type of the argument ``f`` was ``Any``, the type of ``n`` would
	also be ``Any``, resulting in a generic, slower increment operation
	being used.

	Strict runtime type checking
	----------------------------

	Compiled code ensures that any variable or expression with a
	non-erased type only has compatible values at runtime. This is in
	contrast with using optional static typing, such as by using mypy,
	when type annotations are not enforced at runtime. Mypyc ensures
	type safety both statically and at runtime.

	``Any`` types and erased types in general can compromise type safety,
	and this is by design. Inserting strict runtime type checks for all
	possible values would be too expensive and against the goal of
	high performance.

	.. _value-and-heap-types:

	Value and heap types
	--------------------

	In CPython, memory for all objects is dynamically allocated on the
	heap. All Python types are thus heap types. In compiled code, some
	types are value types -- no object is (necessarily) allocated on the
	heap. ``bool``, ``float``, ``None``, :ref:`native integer types <native-ints>`
	and fixed-length tuples are value types.

	``int`` is a hybrid. For typical integer values, it is a value
	type. Large enough integer values, those that require more than 63
	bits (or 31 bits on 32-bit platforms) to represent, use a heap-based
	representation (same as CPython).

	Value types have a few differences from heap types:

	* When an instance of a value type is used in a context that expects a
	heap value, for example as a list item, it will transparently switch
	to a heap-based representation (boxing) as needed.

	* Similarly, mypyc transparently changes from a heap-based
	representation to a value representation (unboxing).

	* Object identity of integers, floating point values and tuples is not
	preserved. You should use ``==`` instead of ``is`` if you are comparing
	two integers, floats or fixed-length tuples.

	* When an instance of a subclass of a value type is converted to the
	base type, it is implicitly converted to an instance of the target
	type. For example, a ``bool`` value assigned to a variable with an
	``int`` type will be converted to the corresponding integer.

	The latter conversion is the only implicit type conversion that
	happens in mypyc programs.

	Example::

	def example() -> None:
	# A small integer uses the value (unboxed) representation
	x = 5
	# A large integer uses the heap (boxed) representation
	x = 2**500
	# Lists always contain boxed integers
	a = [55]
	# When reading from a list, the object is automatically unboxed
	x = a[0]
	# True is converted to 1 on assignment
	x = True

	Since integers and floating point values have a different runtime
	representations and neither can represent all the values of the other
	type, type narrowing of floating point values through assignment is
	disallowed in compiled code. For consistency, mypyc rejects assigning
	an integer value to a float variable even in variable initialization.
	An explicit conversion is required.

	Examples::

	def narrowing(n: int) -> None:
	# Error: Incompatible value representations in assignment
	# (expression has type "int", variable has type "float")
	x: float = 0

	y: float = 0.0 # Ok

	if f():
	y = n # Error
	if f():
	y = float(n) # Ok

	.. _native-ints:

	Native integer types
	--------------------

	You can use the native integer types ``i64`` (64-bit signed integer),
	``i32`` (32-bit signed integer), ``i16`` (16-bit signed integer), and
	``u8`` (8-bit unsigned integer) if you know that integer values will
	always fit within fixed bounds. These types are faster than the
	arbitrary-precision ``int`` type, since they don't require overflow
	checks on operations. They may also use less memory than ``int``
	values. The types are imported from the ``mypy_extensions`` module
	(installed via ``pip install mypy_extensions``).

	Example::

	from mypy_extensions import i64

	def sum_list(l: list[i64]) -> i64:
	s: i64 = 0
	for n in l:
	s += n
	return s

	# Implicit conversions from int to i64
	print(sum_list([1, 3, 5]))

	.. note::

	Since there are no overflow checks when performing native integer
	arithmetic, the above function could result in an overflow or other
	undefined behavior if the sum might not fit within 64 bits.

	The behavior when running as interpreted Python program will be
	different if there are overflows. Declaring native integer types
	have no effect unless code is compiled. Native integer types are
	effectively equivalent to ``int`` when interpreted.

	Native integer types have these additional properties:

	* Values can be implicitly converted between ``int`` and a native
	integer type (both ways).

	* Conversions between different native integer types must be explicit.
	A conversion to a narrower native integer type truncates the value
	without a runtime overflow check.

	* If a binary operation (such as ``+``) or an augmented assignment
	(such as ``+=``) mixes native integer and ``int`` values, the
	``int`` operand is implicitly coerced to the native integer type
	(native integer types are "sticky").

	* You can't mix different native integer types in binary
	operations. Instead, convert between types explicitly.

	For more information about native integer types, refer to
	:ref:`native integer operations <int-ops>`.