Simple Parallel Optimization in Mathematica
February 2 2016 in Mathematica | Tags: optimization, parallel | Author: Christopher Rackauckas
A quick search on Google did not get hits for a standard method of parallelizing NMaximize and NMinimize, so I wanted to share how I did it.
My implementation is a simple use of a Map-Reduce type of parallelism. What this means is that we map the same problem out to N many processes, and when they finish they each give one result, and we apply a reduction function to reduce the N results to 1. So what we will do is map the NMaximize function to each of N cores, where they will each solve it on a random seed. From there, each process will return what it found as the minimum, and we will take the minimum of these minimums as our best estimate of the global minimums.
Notice that this is not optimal in all cases: for example, the (adaptive) simulated annealing method in NMaximise utilizes the correlations between different members of “the population” to better pick steps. Here, each of the 16 simulated annealings will be independent, and so we will not do “16x better”, but there will still be a massive speedup.
For example, on my server with 16 cores, I calculate:
numTimes = 16; testList = Timing[ParallelTable[ NMinimize[ 100 (y - x^2)^2 + (1 - x)^2 + (1 - y)^4 + (2 x^4 - 3 y)^2, {x, y}, Method -> {"NelderMead", RandomSeed -> k}], {k, numTimes}]] {0.038197, {{0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.0990871, {x -> 1.20902, y -> 1.45691}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {4.83384, {x -> -1.14862, y -> 1.30569}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}, {0.942042, {x -> 0.530856, y -> 0.270048}}}}
Notice how this returned the results of 16 separate NMinimizes (using different random seeds), all as one list, in the time it takes to run 16. There was no need to setup parallel kernals, Mathematica setup 16 kernals (the number of CPUs in my computer) for me, and it took about the same time as one NMinimize. It just so happened that in the example I made there was exactly one configuration that hit a lower minimum than the others (I got pretty lucky!). To grab that, we order the list and then take the first item like so:
testList[[2]][[Ordering[testList[[2]][[All, 1]], 1]]] {{0.0990871, {x -> 1.20902, y -> 1.45691}}}
As mentioned before, this is not the “best” way to parallelize an optimization algorithm, but it’s a lot better than nothing. In this case, it found a much better result than I most likely would’ve gotten from 1 run, but in the same time as 1 run. Until Wolfram parallelizes their internal algorithm, this acts as a quick fix.