Mr. Graph

Welcome to Mr. Graph!

Mr. Graph is a python library designed to make composing graphs of sync and async functions easy!

Installation

From PyPi

latest release:

pip install mr-graph

From source

Main branch using git+https.:

pip install git+https://github.com/mcminis1/mr-graph.git

Concepts

Mr. Graph consists of 3 kinds of entities:

  1. Graphs

  2. Nodes

  3. NodeDataClass

Graphs

A Graph is an entity that works like a function. To build a graph you must add functions to it. Once you’ve added functions to it, they are converted to nodes and can be organized into an execution graph. If you’ve chosen to name your inputs and outputs according to convention (as described in the tutorial), you may not need to organize the graph before executing.

Nodes

A wrapper for a function that’s used to track the input and outputs from a function as it’s called throughout the graph.

The convention used throughout Mr. Graph is that you can use google docstrings to name your outputs from functions. This allows them to be tracked and ordering functions can be done by just looking at input and output names and types.

NodeDataClass

The dataclasses used as input and output from the functions in the graph.

NodeDataAggregator

A class that allows you to aggregate results from nodes in the graph to fan-in to downstream nodes.

Tutorial

In these short sections you’ll learn how to build and execute graphs using Mr. Graph.

Defining Functions

You can use blocking or async function in Mr. Graph. If you are already using google style docstrings all you need to change to take advantage of Mr. Graph is naming your outputs.

A blocking example with no inputs.

def return_one():
    """
    return 1

    return one with no inputs

    Returns
    -------
    m : float
        equal to 1
    """
    return 1

An async example that has an input and output.:

async def add_1(n: float):
    """
    add 1 to a number

    add 1 to the input value n.

    Parameters
    ----------
    n : float
        number to add 1 to.

    Returns
    -------
    m : float
        equal to n + 1
    """
    return n + 1

Implicit Graph Definition

As demonstrated in the root quickstart, you can wire up graphs implicitly

async def build_implicit_linear_graph():
   """
   Return 2

   Simple example linear graph. Wired up automagically using the input and output variable names.

   Parameters
   ----------

   Returns
   -------
   two : float
      always equal to 2
   """

   g = Graph(nodes=[return_one, add_1])
   return await g()

This method will return a dataclass with an attribute named ‘m’ (determined by the output annotation on the add_1 method).

Explicit Graph Definition

For more complex graphs it’s sometimes required to wire them up manually.

async def build_explicit_linear_graph():
   """
   Return 2

   Simple example linear graph. Wired up manually.

   Parameters
   ----------

   Returns
   -------
   two : float
      always equal to 2
   """

   g = Graph(nodes=[return_one, add_1])
   o_1 = g.return_one()
   g.outputs = g.sub_1(o_1)
   return await g()

In this case the return from build_explicit_linear_graph is the same as previously demonstrated in the implicit example; a dataclass with a single attribute named ‘m’. If you inspect g.outputs you’ll find that it’s a dataclass with a single attribute named ‘m’ and is equal to None until the graph is executed.

Defining Graph Inputs

We can define an input for the graph, and then pass in a value to it when executing it.

async def pass_input_to method(input_val: int):
   """
   Return input_val + 1

   Simple example linear graph. pass in value and get value + 1 back.

   Parameters
   ----------

   Returns
   -------
   m : float
      equal to input_val + 1
   """

   g = Graph(nodes=[add_1])
   i_0 = g.input(name="n")
   g.outputs = g.add_1(i_0)
   return await g(n=input_val)

Alternatively, in this case, there would be no ambiguity if you passed in the input_val as an arg instead of a keyword arg.

async def pass_input_to method(input_val: int):
 ...
   return await g(input_val)

In general, Mr. Graph tries to wire things up using names. However, when it’s unambiguous, it is possible to rely on ordering.

Multiple Outputs

Sometimes you need to return multiple values from a graph.

async def sub_1(m: int):
    """
    subtract 1 from a number

    subtract 1 from the input value m.

    Parameters
    ----------
    m : float
        number to subtract 1 to.

    Returns
    -------
    p : float
        equal to n - 1
    """
    return m - 1

def mult_2(p: float):
    """
    multiply a number by 2

    returns p*2

    Parameters
    ----------
    p : float
        number to multiply by 2

    Returns
    -------
    q : float
        equal to p * 2
    """
    return 2 * p

async def fan_out_graph(input_val: int):
    # fan out example
    g = Graph()
    g.add_nodes([sub_1, add_1, mult_2])

    i_0 = g.input(name="n")
    o_1 = g.add_1(i_0)
    g.outputs = g.mult_2(o_1)
    g.outputs += g.sub_1(o_1)

    return await g(n=input_val)

This will return a dataclass with two attributes: p and q.:

v = await fan_out_graph(2)
assert v.q == 4
assert v.p == 1

When returning multiple values they are combined into a single dataclass object and returned. If there are conflicting names on the dataclasses then it will raise an error.

Aggregating results

Sometimes its useful to aggregate results from many different nodes into a list to pass to a function (fan-in architecture). There is a special class member that allows you to build those lists called an aggregator

llm = Graph(nodes=[get_structured_answer, summarize_answers])

answers = llm.aggregator(name="answers")
for question in questions:
    sa = llm.get_structured_answer(user_question=question)
    answers += sa.answer
llm.outputs = llm.summarize_answers(answers=answers)

v = await llm(answers)
return v.summary

In this example a list of answers is aggregated and used as an input to another function.

Philosophy

The purpose of Mr. Graph is to provide an efficient graph processing library for I/O bound functions. All blocking calls are ran in their own threads, and are executed concurrently with async functions. As these functions return their results, downstream functions are executed when their dependencies are done being computed.

Return values are always wrapped in dataclasses whose names and types are specified using docstrings on the functions.

This project might be useful for you if you have a graph of API calls to make which depend on each other.

Features

Mr. Graph is new and under active development. Current features include:

  • Use with either async or sync functions

  • Uses regular documentation formats to name return values.

  • Can infer pipelines from input and output signatures

  • All directed acyclic graph layouts supported. linear, fan-in, fan-out.

If you’re interested in contributing, please create a ticket on github and suggest a feature!

Example Usage

Building graphs can be as easy as:

from mr_graph import Graph

def return_one():
   """
   return 1

   return one with no inputs

   Returns
   -------
   m : float
      equal to 1
   """
   return 1

async def add_1(m: float):
   """
   add 1 to a number

   add 1 to the input value m.

   Parameters
   ----------
   n : float
      number to add 1 to.

   Returns
   -------
   n : float
      equal to n + 1
   """
   return n + 1

async def build_implicit_linear_graph():
   """
   Return 2

   Simple example linear graph. Wired up automagically using the input and output variable names.

   Parameters
   ----------

   Returns
   -------
   two : float
      always equal to 2
   """

   g = Graph(nodes=[return_one, add_1])
   return await g()

Indices and tables