MATLAB 2016a Release Summary for Scientific Computing

There is a lot to read every time MATLAB releases a new version. Here is a summary of what has changed in 2016a, through the eyes of someone doing HPC, scientific computing, and numerical analysis. This means I will leave out a lot, and you should check the full release notes yourself, but if you’re using MATLAB for science then this may cover most of the things you care about.

  1. Support for sparse matrices on the GPU. Nice additions are sprand and pcg (the Preconditioned Conjugate Gradient solver) for sparse GPU matrices.
  2. One other big change in the Parallel Computing Toolbox is that you can now set nonlinear solvers to estimate gradients and Jacobians in parallel. This should be a nice boost to the MATLAB Optimization Toolbox.
  3. In the Statistics and Machine Learning Toolbox, they added some algorithms for high-dimensional data and now let you run kmeans … READ MORE

Interfacing with a Xeon Phi via Julia

(Disclaimer: This is not a full-Julia solution for using the Phi, and instead is a tutorial on how to link OpenMP/C code for the Xeon Phi to Julia. There may be a future update where some of these functions are specified in Julia, and Intel’s compilertools.jl looks like a viable solution, but for now it’s not possible.)

Intel’s Xeon Phi has a lot of appeal. It’s an instant cluster in your computer, right? It turns out it’s not quite that easy. For one, the installation process itself is quite tricky, and the device has stringent requirements for motherboard choices. Also, maxing out at over a teraflop is good, but not quite as high as NVIDIA’s GPU acceleration cards.

However, there are a few big reasons why I think our interest in the Xeon Phi should be renewed. For one, Intel … READ MORE

Simple Parallel Optimization in Mathematica

February 2 2016 in Mathematica | Author: Christopher Rackauckas

A quick search on Google did not get hits for a standard method of parallelizing NMaximize and NMinimize, so I wanted to share how I did it.

My implementation is a simple use of Map-Reduce parallelism. This means that we map the same problem out to N processes; when they finish, each gives one result, and we apply a reduction function to reduce the N results to 1. So what we will do is map the NMinimize function to each of N cores, where each solves the problem with a different random seed. From there, each process returns the minimum it found, and we take the minimum of these minima as our best estimate of the global minimum. (For NMaximize, take the maximum of the maxima instead.)
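To make the map and reduce steps concrete, here is a minimal sketch of the same pattern in Python rather than Mathematica. It is not the implementation from this post: the test function `objective`, the seeded random local search `local_search` (a stand-in for NMinimize), and the helper `parallel_minimize` are all hypothetical names chosen for illustration. It also uses a thread pool to stay portable; for CPU-bound work, a process pool would be the closer analogue of separate kernels.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def objective(x):
    # Hypothetical multimodal test function with several local minima.
    return x * x + 10.0 * math.sin(3.0 * x)


def local_search(seed, lo=-5.0, hi=5.0, iters=2000):
    """One 'map' task: a seeded random local search (stand-in for NMinimize)."""
    rng = random.Random(seed)  # independent random stream per task
    x = rng.uniform(lo, hi)    # random starting point inside the bounds
    fx = objective(x)
    step = (hi - lo) / 4.0
    for _ in range(iters):
        # Propose a nearby point, clamped to the search interval.
        cand = min(hi, max(lo, x + rng.gauss(0.0, step)))
        fc = objective(cand)
        if fc < fx:
            x, fx = cand, fc                # accept only improvements
        else:
            step = max(step * 0.99, 1e-6)   # shrink the step after a failed move
    return fx, x


def parallel_minimize(seeds, max_workers=4):
    # Map: run the same search once per seed.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(local_search, seeds))
    # Reduce: keep the result with the smallest objective value.
    return min(results), results


if __name__ == "__main__":
    best, results = parallel_minimize(range(8))
    print("best value %.4f at x = %.4f" % best)
```

Each task returns a `(value, x)` pair, so the reduction is just `min` over the list. Because every task owns its seeded generator, rerunning with the same seeds reproduces the same answer.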

Notice that this is not optimal in all cases: for … READ MORE