Parallel Task Execution

Published 2020-04-02 on Farid Zakaria's Blog

A common task is to have a set of tasks; each with their declared dependencies. This setup exists in a variety of build tools such as make or rake

For instance lets consider this hypothetical Rakefile which defines some tasks.

task :a => [:b, :c]

task :b => [:d]

task :c

task :d
G a a b b a->b c c a->c d d b->d

The goal would be to process the list of tasks in topological order; however do so in parallel.

A non-parallel topological walk may result in d -> b -> c -> a which would take 4 units.

However if we were to optimize the processing it would be posssible to do it in [d, c] -> [b] -> [a]; 3 units.

Topological sort is well understood1, however I was not able to find online a succint code snippet for doing so in parallel.

# construct a directed acyclic graph representation
# an adjacency list structure is likely the simplest
graph = compute the directed acyclic graph

# create some form of executor where you can
# submit some jobs;
executor = construct an executor service

# we'll need a mutex to make some code safe
# unless the grap you are using is thread safe
mutex = construct mutex

mutex.sychronize do
    # keep looping until the graph is empty
    until graph.empty?
        # sinks are nodes whose outdegree is 0;
        # which means that there are no
        # edges leaving the node
        sinks = graph.sinks
        sinks.each do |sink|
            # we wrap the task so we can execute
            # some end handling
            executor.submit do
                # run the task!
                mutex.sychronize do
                    # remove this task from the graph
                    graph.remove_node sink
                # wake up the loop thread
        # wait pauses the thread and releases the lock
        # this is the optimal way to know when to wake up

Enjoy some parallel task execution now.

  1. Common interview question