A common task is to have a set of tasks; each with their declared dependencies. This setup exists in a variety of build tools such as make or rake
For instance lets consider this hypothetical Rakefile which defines some tasks.
task :a => [:b, :c]
task :b => [:d]
task :c
task :d
The goal would be to process the list of tasks in topological order; however do so in parallel.
A non-parallel topological walk may result in d -> b -> c -> a
which would take 4 units.
However if we were to optimize the processing it would be possible to do it
in [d, c] -> [b] -> [a]
; 3 units.
Topological sort is well understood1, however I was not able to find online a succinct code snippet for doing so in parallel.
# construct a directed acyclic graph representation
# an adjacency list structure is likely the simplest
graph = compute the directed acyclic graph
# create some form of executor where you can
# submit some jobs;
executor = construct an executor service
# we'll need a mutex to make some code safe
# unless the grap you are using is thread safe
mutex = construct mutex
mutex.synchronize do
# keep looping until the graph is empty
until graph.empty?
# sinks are nodes whose outdegree is 0;
# which means that there are no
# edges leaving the node
sinks = graph.sinks
sinks.each do |sink|
# we wrap the task so we can execute
# some end handling
executor.submit do
# run the task!
task.run
mutex.synchronize do
# remove this task from the graph
graph.remove_node sink
end
# wake up the loop thread
mutex.notify
end
end
# wait pauses the thread and releases the lock
# this is the optimal way to know when to wake up
mutex.wait
end
end
Enjoy some parallel task execution now.
-
Common interview question ↩