Systemic wins LIFT award, new version of Systemic (2.17)

I am very happy to announce that Systemic has been awarded the LIFT grant. This means that, together with my colleagues Joel Green and Randi Ludwig, I will be able to work full-time — instead of as a side project — on improving and expanding Systemic, and creating new educational apps like Systemic Live and Super Planet Crash, over the next year!

I am also releasing a new release of Systemic 2 (2.17) which addresses some bugs and improves the documentation for installation on Linux. You can download it now.

Below are some of the changes:

– NEW: ktable function for listing the fit values as a table, suitable for exporting to TeX or HTML
– NEW: Bayesian Information Criterion menu item
– NEW: Quadratic trend term
– CHANGED: Periodograms report the normalized power between 0 < p < 1, where power = 1 is a perfect fit. — This is a bug corrected in 2.172.
– FIXED: bug where the semi-amplitude would be calculated incorrectly for massive bodies. (credit: Trifon Trifonov)
– FIXED: bug in simplex that would crash the application if the minimizer encountered a NaN value.
– FIXED: bug in the GUI that would crash the application in case of excessive text output.
– FIXED: Fixed naming of columns in the matrix returned by kdata().
– FIXED: bug where the radial velocity curve or periodogram would look excessively jagged.
– FIXED: bug in kperiodogram.boot where the function could crash.
– FIXED: bug in kperiodogram.boot where the function would only calculate the ‘full’ periodogram (instead of the periodogram of residuals) for certain inputs.
– FIXED: bug in the GUI periodogram routine, where you could receive an error for certain power spectra.
– FIXED: Kernel plot using plot() respects the chosen xlim.
– FIXED: MCMC would crash in certain situations when set up from the menu.
– FIXED: 1LO-crossval returns the signed sum of logs, instead of the absolute value.
– FIXED: clarified the installation instructions (Readme.txt) for Linux. (credit: Franz Feldtkeller)
– FIXED: you can now choose a path for R that is not /usr/bin/R by selecting Help -> Set path to R…
– FIXED: F-test menu item uses the current kernel instead of the kernel named “k”.
– Various bug-fixes in the plotting routines.

Bugs reported at are unfortunately still open due to lack of time to address them. They will be fixed in 2.18.

New release of Systemic 2 (2.14)

I have just released a new version of Systemic 2 (2.14).  Lots of little bug fixes, together with a few features.

My favorite is the “smooth orbit plot”, which recreates plots that will appear on an upcoming planet paper (I will put proper credit here once the paper is out!). The routine takes samples of orbital elements from the output of a Markov-Chain Monte Carlo or bootstrap run, and plots the orbits with some transparency, so that it is visually evident where orbits “crowd up”.

Smooth orbit plot
Smooth orbit plot made with Systemic 2.14

On top of the samples, I plotted the “best-fit” orbit in red (which hopefully will fall on top of the range of possible orbital elements!). This plot can also give some visual sense of multi-modal or non-symmetrical element distributions.

Download Systemic 2.14

Other changes:
– Added a feature for smoother/faster plotting in GUI
– Periodograms now print out the strongest peaks
– Periodograms are zoomable
– Fixed Cross-validation for fits with only one planet
– New menu item to add random Gaussian noise to data
– Smooth orbits plot from an error estimation object
– FIXED: Improved Linux installation instructions (credit for reporting: Thomas Kosvic)
– FIXED: bug in SWIFTRMVS on 32-bit installations (credit for reporting: Thomas Kosvic)

Systemic 2 Cookbook: Markov-Chain Monte Carlo

This post is the second in the “Systemic Cookbook” series, which will show how to use Systemic 2 for scientific purposes (especially its more advanced features not readily accessible from its user interface). More comprehensive documentation for Systemic is forthcoming.

All posts in this series will be collected in the “Systemic 2″ category.

I wrote a short draft guide to doing Markov-Chain Monte Carlo-type of fitting and error analysis with Systemic2. This guide covers the very basics of the algorithm, the input parameters, and a step-by-step guide on how to drive the algorithm from the user interface, or from the C library. Please note that more sophisticated MCMC algorithms are available as R packages on CRAN.

→ Download PDF

Systemic 2 Cookbook: Parallel computation with R

This post is the first in the “Systemic Cookbook” series, which will show how to use Systemic 2 for scientific purposes (especially its more advanced features not readily accessible from its user interface). More comprehensive documentation for Systemic is forthcoming.

All posts in this series will be collected in the “Systemic 2″ category.

Systemic 2 automatically parallelizes so-called “embarrassingly parallelizable” workloads that do not need shared data or synchronization. Such workloads include bootstrap resampling and periodograms (each resampled data set is analyzed independently), Markov Chain Monte Carlo (each chain runs in parallel, and only synchronizes at infrequent intervals), jackknife, and other algorithms. Some algorithms can even be launched on Xgrid-connected clusters (Why, Apple?!).

The majority of embarrassingly parallelizable workloads, however, will be better defined by the user. In the rest of this article, I will show how to easily parallelize dynamical integration over a grid using an R script. (The following problem could be just as easily set up using C, but I will be showing how to accomplish it in R since the code will be most transparent in a high-level language.)

Setting up dynamical integration over a grid

The (somewhat artificial) problem is as follows. Imagine that we would like to investigate the survival time of a two-planet system, as we increase the eccentricity of the outer planet. Intuitively, we expect the system to become unstable faster as the eccentricity increases.

Let’s first set up a sample system and integrate it forward in time with Systemic2.

To run this script, type

Rscript scriptname.r

from the shell (edit the script beforehand to point it to the correct path to Systemic).

This script sets up a two-planet system around a 0.95 solar mass star, integrate it forward 50,000 years using the SWIFT backend and plot it to a PDF file. The “k” object is called a kernel within Systemic; it contains all the data and functions associated with the planetary system (internally, it is a handle returned by the C library).integration_1e5 Next, we create a grid of eccentricity and longitude of periastron angles. The rationale here is to average the survival times for each eccentricity, over the longitude of pericenter in order to remove the dependency on the phase angle (again, this is an artificial exercise!).

This part of the script involve some idiomatic R functions (not too dissimilar from the strategies used by Matlab, Numpy and other vectorized languages). To create a grid of eccentricity/phase angles (a cartesian product), I used the expand.grid function. Subsequently, I loop over the (e, lop) pairs using mapply. Mapply (m-apply) applies a function over each element of the two vectors and collects the return values of the function.

The function called by mapply sets up a kernel with the given eccentricity and longitude angle, and integrates the system forward in time using kintegrate. Kintegrate is the Systemic function that integrates a system forward in time,  returning an object containing information about the dynamical evolution of the system. One of the fields of the object is “survival.time”, how long the system survives (in this case, whether there is a close encounter or an ejection of one of the bodies).  Then, we collect the survival times, group them by eccentricity (using split) and take the median (using sapply).  Finally, we plot the eccentricity vs. the survival time.

Parallelizing the problem

Recent R builds include the parallel package. The parallel package includes various implementations of apply (map) functions that are drop-in replacements for the builtin R functions (e.g. the *apply family of functions). Parallelization is achieved by transparently launching a number of “worker” processes, executing the function on each worker and finally collecting the result (the workers are heavyweight processes rather than threads, so you will see several R processes if you inspect your running programs).

Screen Shot 2014-01-16 at 8.36.07 PM

Here is the revised script:

Note that the only change was using the “mcmapply” (multicore-mapply) function instead of the “mapply” builtin! Since this problem is eminently parallelizable, the speedup is linear in the number of physical cores.

Closing words

In this case, parallelization was trivial; in other cases, it might be more involved (for instance, when balancing is required).