Diverged evaluation

Rationale

Data analysis pipelines can have diverged processing steps, where a specific function is applied repeatedly to each of many individual data items (e.g., enhancing each image in a stack of images). In such diverged steps, each data item can be calculated independently, and we may want to calculate only some of the items at a given time. Furthermore, a change to an upstream parameter may affect the calculations of only some of the data items, while cached calculations of the other items remain valid (e.g., changing an enhancement parameter specific to one image requires reprocessing that image alone). We therefore need ways to independently calculate, cache and track the validity of each data item in such diverged analysis steps. In Quibbler, such independent processing and tracking is automatically enabled when we use the NumPy syntax of vectorize and apply_along_axis.

Applying such NumPy vectorized functions to quib arguments creates a vectorized function quib whose output array is calculated, cached and invalidated not as a whole, but rather element by element or slice by slice.
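As a point of reference for the slice-by-slice case, here is a minimal sketch of how np.apply_along_axis behaves in plain NumPy, without quibs. The function normalize_row and the sample data are hypothetical and serve only as illustration; per the text above, the same call applied to quib arguments would be expected to create a function quib evaluated slice by slice.

```python
import numpy as np

def normalize_row(row):
    # Hypothetical per-slice function: scale a 1-D slice so it sums to 1.
    return row / row.sum()

data = np.array([[1.0, 1.0, 2.0],
                 [2.0, 6.0, 8.0]])

# Apply normalize_row independently to each 1-D slice along axis 1 (each row).
normalized = np.apply_along_axis(normalize_row, 1, data)
print(normalized)
```

Each row is processed by a separate call to normalize_row, so the result of one row does not depend on any other row.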

Quickly reviewing the standard behavior of np.vectorize

NumPy’s np.vectorize provides a standard syntax to vectorize a given function such that when applied to array arguments it creates a new array by acting repeatedly on each element of the array arguments (or across slices thereof, see the signature kwarg).
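The signature kwarg mentioned above can be illustrated with a small plain-NumPy sketch. The function span and the sample array are hypothetical; the signature '(n)->()' states that each call receives a 1-D slice of length n and returns a scalar.

```python
import numpy as np

def span(values):
    # Hypothetical core function: receives one 1-D slice, returns a scalar.
    return values.max() - values.min()

# '(n)->()' declares that each call consumes a 1-D slice of length n and
# produces a scalar, so np.vectorize loops over the remaining leading axes.
v_span = np.vectorize(span, signature='(n)->()')

stack = np.array([[1, 5, 3],
                  [10, 2, 4]])
spans = v_span(stack)  # one call to span per row
print(spans)
```

Without a signature, np.vectorize would instead call the function once per scalar element of the input array.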

For example:

# Import Quibbler:
import pyquibbler as qb
from pyquibbler import iquib, q
qb.initialize_quibbler()

# Other imports:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib tk

def my_sqr(x):
    print(f'calculating my_sqr of x = {x}')
    return x ** 2


v_my_sqr = np.vectorize(my_sqr, otypes=[int])

In this example, v_my_sqr is the vectorized form of my_sqr; when v_my_sqr is applied to an array, it executes the underlying function my_sqr on each element of the input array:

v_my_sqr(np.array([0, 1, 2, 3, 4]))
calculating my_sqr of x = 0
calculating my_sqr of x = 1
calculating my_sqr of x = 2
calculating my_sqr of x = 3
calculating my_sqr of x = 4
array([ 0,  1,  4,  9, 16])

Applying a vectorized function to quib arguments creates a vectorized function quib

In analogy to the standard behavior above, applying a vectorized function to quib arguments creates a vectorized function quib that calculates its output by calling the underlying function on each element of the output of its quib arguments. As with other function quibs, this definition is declarative (lazy by default), so no calculations are initially performed:

x = iquib(np.array([0, 1, 2, 3, 4]))
x_sqr = v_my_sqr(x).setp(cache_mode='on')

Calculations are only performed once we request the output of the function quib:

x_sqr.get_value()
calculating my_sqr of x = 0
calculating my_sqr of x = 1
calculating my_sqr of x = 2
calculating my_sqr of x = 3
calculating my_sqr of x = 4
array([ 0,  1,  4,  9, 16])

Vectorized quibs independently calculate and cache specifically requested array elements

As the output of vectorized function quibs is calculated element-by-element, there is no need to calculate the entire array if only specific elements are requested. Indeed, an np.vectorize quib knows to only calculate the array elements specifically needed to provide a requested output.

For example, let’s repeat the simple code above, but only ask for the value of x_sqr at a specific element. Quibbler will only evaluate the function at the requested position:

x = iquib(np.array([0, 1, 2, 3, 4]))
x_sqr = v_my_sqr(x).setp(cache_mode='on')
x_sqr[3].get_value()
calculating my_sqr of x = 3
9

The values resulting from each call to the underlying function are independently cached, so further requests for array output only calculate the parts of the array not yet calculated:

x_sqr[2:].get_value()
calculating my_sqr of x = 2
calculating my_sqr of x = 4
array([ 4,  9, 16])
x_sqr.get_value()
calculating my_sqr of x = 0
calculating my_sqr of x = 1
array([ 0,  1,  4,  9, 16])

Vectorized quibs track validity of individual array elements

Not only are the array elements of vectorized function quibs individually calculated and cached; their validity is also tracked independently upon upstream changes.

When an upstream value changes, the change invalidates only the specifically affected array elements. Only the calculation of these elements is then repeated when the output is requested:

x[3] = 10
x_sqr.get_value()
calculating my_sqr of x = 10
array([  0,   1,   4, 100,  16])
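The behavior shown in the outputs above (on-demand evaluation, per-element caching and per-element invalidation) can be mimicked with a toy model in plain Python and NumPy. This is only a conceptual sketch and not Quibbler's actual implementation; the class ElementwiseCache and its methods get and assign are hypothetical names introduced for illustration.

```python
import numpy as np

class ElementwiseCache:
    """Toy model of per-element lazy evaluation, caching and invalidation."""

    def __init__(self, func, source):
        self.func = func
        self.source = source                              # mutable 1-D input array
        self.values = np.zeros(len(source), dtype=int)    # cached outputs
        self.valid = np.zeros(len(source), dtype=bool)    # per-element validity
        self.calls = []                                   # inputs actually evaluated

    def get(self, index=slice(None)):
        # Evaluate only the requested elements that are not yet validly cached.
        for i in np.atleast_1d(np.arange(len(self.source))[index]):
            if not self.valid[i]:
                self.calls.append(int(self.source[i]))
                self.values[i] = self.func(self.source[i])
                self.valid[i] = True
        return self.values[index].copy()

    def assign(self, i, new_input):
        # An upstream change invalidates only the affected element.
        self.source[i] = new_input
        self.valid[i] = False

cache = ElementwiseCache(lambda x: x ** 2, np.array([0, 1, 2, 3, 4]))
first = cache.get(3)                # evaluates x = 3 only
tail = cache.get(slice(2, None))    # evaluates x = 2 and x = 4
full = cache.get()                  # evaluates x = 0 and x = 1
cache.assign(3, 10)
updated = cache.get()               # re-evaluates only x = 10
print(cache.calls)                  # order in which inputs were evaluated
```

The calls log follows the same order as the printed messages in the Quibbler session above: each input is evaluated once, and only the reassigned element is evaluated again.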

Using vectorize for graphic functions

Vectorized function quibs readily facilitate creating multiple instances of similar graphic elements. This is done simply by vectorizing an underlying function that creates graphics and setting Quib.is_graphics=True in the vectorize command.

Here is a simple example:

from functools import partial

# define graphics vectorize function
@partial(np.vectorize, is_graphics=True, signature='(),(2),(2),()->()')
def draw_arrow(ax, xy0, dxy, w):
    xy1 = xy0 + dxy
    ax.plot([xy0[0], xy1[0]], [xy0[1], xy1[1]], 'r-')
    phi = np.pi + np.arctan2(dxy[1], dxy[0])
    phi1 = phi - 0.3
    phi2 = phi + 0.3
    ax.plot([xy1[0], xy1[0] + w*np.cos(phi1)], [xy1[1], xy1[1] + w*np.sin(phi1)], 'r')
    ax.plot([xy1[0], xy1[0] + w*np.cos(phi2)], [xy1[1], xy1[1] + w*np.sin(phi2)], 'r')

# prepare figure
plt.figure()
ax = plt.gca()
ax.axis('square')
ax.axis([0, 50, 0, 50])

# define quibs:
xy = iquib(np.array([[10, 10], [20, 20], [30, 30], [40, 40]]))
xy_tail = xy[0:-1]
xy_head = xy[1:]
dxy = xy_head - xy_tail
w = iquib(4.)

# draw
draw_arrow(ax, xy_tail, dxy, w);
plt.plot(xy[:,0], xy[:,1], 'ob', markersize=4, picker=True);
_images/Divergence_arrows.gif

Passing quibs as arguments allows inverse assignment from vectorized quibs

In the examples above, when the vectorized function quib gets quib arguments, it sends to the underlying function the output values of these quibs at the given array positions. The underlying function thus deals with regular, non-quib, arguments. Alternatively, it is also possible to send the underlying function quib arguments that reference the vectorize quib arguments at the corresponding indices. This behavior is controlled by the pass_quibs kwarg of np.vectorize. Setting pass_quibs=True passes quibs as arguments, enabling additional functionality, in particular the ability to inverse assign from graphics created within the function.

See this example:

from matplotlib.widgets import RectangleSelector, Slider

# Set figure:
plt.figure(figsize=(4, 5))
ax = plt.gca()
ax.axis('square')
ax.axis([0, 100, 0, 100])
ax_slider = plt.axes([0.2, 0.05, 0.6, 0.05])

# Define quibs:
number_of_rectangles = iquib(3, assignment_template=(1, 8))
ext_default = iquib(np.array([10, 20, 10, 20]))
exts = np.tile(ext_default, (number_of_rectangles, 1))
exts.setp(allow_overriding=True, assigned_quibs='self')

# Use vectorize with pass_quibs to allow inverse_assignment:
@partial(np.vectorize, signature='(4)->()',
         is_graphics=True, pass_quibs=True)
def rectangle_selector(ext):
    RectangleSelector(ax=ax, extents=ext)
    return

# Graphics:
rectangle_selector(exts)
ax.text(5, 95, q(str, exts), va='top');
Slider(ax=ax_slider, label='n', valmin=1, valmax=8,
       valinit=number_of_rectangles);
_images/Divergence_passquibs.gif

Additional demos

For additional examples, see: