Stefano Meschiari

July 29, 2011
by stefano

You gotta get a gimmick: BAM

BAM is a new, extremely lightweight templating system that converts files containing a mixture of code, metadata and text into a plain text file. We use a fully featured version of BAM as an “automated paper writer” in my research group at UC Santa Cruz. A short guide to BAM is available here.


BAM was born out of my conversations with Greg Laughlin on how to automate the somewhat repetitive task of writing certain types of scientific papers and reports. Since we already had a planet discovery tool at our disposal, we decided to focus on planet discovery papers. In its full version, BAM is deeply integrated with the Systemic Console to produce first drafts of planet discovery papers from a reusable template, with the aim of automating as much as possible of the writing of the abstract, introduction and quantitative analysis. The Console-integrated BAM has some cool NLG (natural language generation) features that we used to write Meschiari et al., 2011 (discovery of HD31253b, HD218566b, HD177830c, HD99492c). Since these planets are somewhat unremarkable, this paper lent itself well to automated writing of the majority of the text. Ideally, BAM templates will evolve to contain enough logic to “comment” on the data they process; for instance, a template could write a few remarks about the distribution of the newly discovered planets in period-eccentricity and period-mass plots, such as the one to the right (where the dots are automatically downloaded from the Extrasolar Planets Encyclopaedia and placed on the plot by BAM as well).

Following the spirit of literate programming, the paper template contained the procedure for the data analysis, fitting, error estimation and all the plots intermingled with the LaTeX layout and fragments of text. I presented this tool at the European Science Foundation conference (presentation in Keynote format), including a somewhat humorous video of the Console discovering, analyzing and publishing four planetary systems in real time based on the directions of the paper template (see video at Greg’s website). It’s become one of my favorite gimmicky (but useful) tools to show off to people.

While we keep the Console-integrated BAM version private for use by our team, I am making a rewritten and simplified version available to download now. You should consider this a preliminary 0.1 version and expect bugs and limitations to crop up; features will be added back in future revisions.

July 4, 2011
by stefano

A new home

My name is Stefano Meschiari. I’m a graduate student in astrophysics by day, and an avid programmer when time allows. I’ve dabbled in many kinds of software projects, including programs to discover exoplanets…

 

The systemic console

…a desktop program to attempt to bring simple numerical calculations to the masses (Solution)…

Solution: a wannabe Matlab

and many other small programs that I’m just now beginning to collect on a unified website. I’m also interested in user interface design and usability.

For a period of time, I used to blog on Tumblr at Sweet, Sweet Cocoa as I was learning the ropes of programming on Mac OS X using Cocoa. Unfortunately (or not, depending on my mood), my research as a grad student kept taking over the time I tried to dedicate to both keeping the blog going, and learning the needed skills to port Solution from a well-skinned Java application to a proper Mac OS X citizen. The hope was to bypass the many bugs and restrictions imposed by Apple’s JVM, and maybe even publish Solution on the Mac App Store.

Due to various time constraints, it didn’t pan out in the end, so I have decided to fix some of the remaining bugs and remove the trial time restriction on Solution. Today I published Solution 1.0 (available for download, and now compatible with Mac OS X Lion!) and made it available for free indefinitely. There are still some outstanding bugs, and I only have a limited amount of time to dedicate to advancing it, given that the Java version is definitely a dead end.

The pipe dream now is to start coding up an iPad version. It would be perfectly adapted to the form factor and portability of the iPad, and given that Apple recently lifted many restrictions on the usage of interpreters (there’s even a full Lua interpreter in the store right now!) I could potentially port the core of Solution with ease.

In the meantime, I will blog about new code projects, (hopefully) interesting programming challenges I encounter, and other techy stuff, with the occasional astronomy stuff thrown into the mix (though there are many awesome astronomy blogs out there: you should visit them!).

Older posts from my previous blogs are underneath.

December 23, 2010
by stefano

Simple primitives container in Objective-C

Today’s post deals with a small pet peeve I have with ObjC so far: using NSArray to store primitives (wrapped and unwrapped via NSNumbers) is rather cumbersome. malloc'ed arrays are also inconvenient to pass around between objects, since they are not reference-counted and don’t carry their length with them.

After reading this article about emulating templates with C macros, I implemented a small set of fixed-length container classes for primitives, with a template-like set of macros.

The macros were used to generate containers for all primitives (short, int, long, long long, char and unsigned variants, BOOL, float, double) in immutable and mutable flavors. The containers can return the primitive itself, or an autoreleased NSNumber instance that wraps the primitive. The interface for PCMutableDoubleArray, for instance, looks like this:

// Immutable interface
- (id) initWithLength: (unsigned int) num; 
- (id) initWithLength: (unsigned int) num andDoubles: (double) double_0, ...;
- (double) doubleAtIndex: (unsigned int) index; 
- (double*) array;
- (unsigned int) count;
- (unsigned int) length;
- (NSNumber*) numberAtIndex: (unsigned int) index;

// Mutable interface
- (void) setNumber: (NSNumber*) number atIndex: (unsigned int) index;
- (void) setDouble: (double) aDouble atIndex: (unsigned int) index;
- (void) setDoubles: (int) length, ...;
- (void) setDoublesFromArray: (PCDoubleArray*) arr;

A .zip archive of the containers is available here (BSD licensed).
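
To give a flavor of the macro trick without the Objective-C scaffolding, here is a minimal sketch in plain C. It is not the code from the archive: the macro name PC_DEFINE_ARRAY and the generated function names are hypothetical, and it leaves out reference counting, the mutable/immutable split and the NSNumber boxing that the real classes provide. It only shows how one macro can stamp out a typed, fixed-length container that carries its own length.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macro (not taken from the original archive): expands to a
   fixed-length array type for one primitive, plus typed accessors. */
#define PC_DEFINE_ARRAY(Name, type)                                      \
    typedef struct {                                                     \
        size_t length; /* the array carries its own length */           \
        type *items;                                                     \
    } Name;                                                              \
                                                                         \
    /* Allocates a zero-filled array; returns NULL on failure. */        \
    static inline Name *Name##_create(size_t length) {                   \
        Name *a = malloc(sizeof *a);                                     \
        if (a == NULL) return NULL;                                      \
        a->items = calloc(length ? length : 1, sizeof(type));            \
        if (a->items == NULL) { free(a); return NULL; }                  \
        a->length = length;                                              \
        return a;                                                        \
    }                                                                    \
    static inline size_t Name##_length(const Name *a) {                  \
        return a->length;                                                \
    }                                                                    \
    /* Bounds-checked read (checked only when assertions are on). */     \
    static inline type Name##_get(const Name *a, size_t index) {         \
        assert(index < a->length);                                       \
        return a->items[index];                                          \
    }                                                                    \
    /* Bounds-checked write. */                                          \
    static inline void Name##_set(Name *a, size_t index, type value) {   \
        assert(index < a->length);                                       \
        a->items[index] = value;                                         \
    }                                                                    \
    static inline void Name##_destroy(Name *a) {                         \
        if (a != NULL) { free(a->items); free(a); }                      \
    }

/* One line per primitive generates a whole typed container. */
PC_DEFINE_ARRAY(PCDoubleArray, double)
PC_DEFINE_ARRAY(PCIntArray, int)
```

With these definitions, PCDoubleArray_create(3) returns a three-element array of zeros, PCDoubleArray_set(a, 1, 2.5) stores a value, and PCDoubleArray_get(a, 1) reads it back. Adding a container for another primitive is a single extra PC_DEFINE_ARRAY line, which is the whole appeal of the approach.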

December 16, 2010
by stefano

Data visualization in Cocoa

Probably one of the biggest advantages of an interactive GUI for scientific programs is the ability to visualize data in real time. While you could just as well output a full dump of your data and plot it with an external program, there is great pleasure in exploring how your model reacts to mucking and futzing around with the parameter space.

In the Systemic Console program (above), I wrote a flexible Swing component for plotting. As you can see, it takes care of plotting scatter (with optional error bars) and line plots, histograms and 3D line plots. Plot styles are easy to customize and are close to most journal styles by default. The component takes care of printing, zooming and tracking the mouse pointer, and has a customization panel. Writing this component myself and optimizing it for quick redrawing took quite a bit of time. The graphical performance of Swing under OS X was often subpar, especially when trying to get antialiasing right. There is no built-in way to generate PDFs (I ended up using the excellent iText for that). And finally, as usual, a lot of the tedium derived from Java’s verbosity. What’s available for Cocoa?

Plotting with Cocoa

I’m still quite a way from worrying about the UI of my new application; currently, I am mostly thinking about the overall framework, how much of it to write in C and how much to wrap in shiny Objective-C classes. However, I’ve spent a bit of time researching the current landscape of data visualization. I will survey some of the options below, along with some pros and cons (IMO).

1. SM2DGraphView

Website, Documentation, Open source
An open-source framework used by a few Mac applications. It can show line and scatter plots, and pie charts. It is very well documented and easy to use (see e.g. this tutorial at MacResearch).

  • PROS: Easy to use, well documented, open source
  • CONS: Plots look outdated, difficult to get output in line with journal standards, low quality output compared to other options

2. Core Plot

Website, BSD license
I have only had time to give this framework a cursory look, but from glancing at the examples and the documentation, it seems like a very complete and decently documented framework. The project is frequently updated, and there seems to be a good number of tutorials and Stack Overflow posts about it, which is mainly due to its compatibility with iOS.

  • PROS: Complete documentation (including a .docset for Xcode), plots look good on screen
  • CONS: None that I could find, though it seems more complicated than the other options.

3. DataGraph framework

Website, Free for open source projects, $400 otherwise (including in-house projects)

This is basically the full plotting component powering the excellent DataGraph. The framework is available for free for open source projects.

The philosophy is very different from the other components listed above. A plot is defined beforehand by creating a template in DataGraph, rather than programmatically in the code; this is similar to how Aqua UIs are usually serialized in nib files designed with Interface Builder rather than created programmatically. This makes setting up a static plot almost trivial. You can also connect sliders, color wells and other interface elements with plot and formula parameters extremely easily. Zooming, panning and exporting are taken care of by the framework. Finally, the output quality is the same as DataGraph, and therefore excellent.

  • PROS: Quality is awesome, free for open source projects, trivial to get up and running with a pre-made DataGraph template
  • CONS: As far as I can tell, you can’t add new plots programmatically, so you might have to allow for several plots in the template and dynamically show/hide them. Documentation is very scarce.

4. A custom NSView

This might be easier than it looks, and is basically the path I took for the Systemic Console to get exactly the look and functionality I needed. It has the advantage of implementing exactly what you need without unwanted baggage.

An object inheriting from NSView will do its drawing in the drawRect: method. The actual drawing can be performed using NSBezierPaths to draw lines, ovals, rects and rounded rects with quality antialiasing. Finally, PDF creation is handled by Cocoa automatically (see, e.g., this).

  • PROS: Completely customizable presentation, PDF export for free, probably the best solution if only handling a few uncomplicated plot types.
  • CONS: Writing everything yourself, meaning time-consuming to get right. NSBezierPath not available on iOS.

Conclusions

This is just a quick and undoubtedly incomplete survey of options for plotting with a Cocoa view. For my project, I will probably go with #3 or #4, with Core Plot a close third and the best option for iOS development. The DataGraph framework is truly impressive for output quality and breadth of options, and trivially easy to set up for static plots.

Please let me know if there’s any library I missed in this post. You can leave a comment below or send me a message with the “Ask anything” link at the top of the blog.

December 3, 2010
by stefano

Pay No Attention to the Runtime Behind the Curtain

You can get a good idea of the complexity of the Objective-C runtime behind the scenes by using clang with the -rewrite-objc option. This simple line to allocate and initialize an NSArray with a few NSString literals

keysArray = [[NSArray alloc] initWithObjects:@"period", @"mass", @"meanAnomaly", @"eccentricity", @"longPeri", nil];

would become something like this ungodly mess, if there were a separate “compile to C” step:

keysArray = ((id (*)(id, SEL, id, ...))(void *)objc_msgSend)((id)((id (*)(id, SEL))(void *)objc_msgSend)
	(objc_getClass("NSArray"), sel_registerName("alloc")), sel_registerName("initWithObjects:"), 
	(id)(NSString *)&__NSConstantStringImpl_OKOrbitalFit_m_0, (NSString *)&__NSConstantStringImpl_OKOrbitalFit_m_1, 
	(NSString *)&__NSConstantStringImpl_OKOrbitalFit_m_2, (NSString *)&__NSConstantStringImpl_OKOrbitalFit_m_3, 
	(NSString *)&__NSConstantStringImpl_OKOrbitalFit_m_4, ((void *)0));

Enough to bring down cdecl.

December 1, 2010
by stefano

Using NSOperation, NSOperationQueue and math frameworks

As I mentioned in my introductory post, I am working on the overall design of a native Cocoa port of the Systemic Console (currently a Java/Swing application). This involves carefully thinking about what some of the core needs will be and which of the core libraries (frameworks) will address them. Therefore, I will try to periodically talk about what class/library I am learning for a particular section of the port, with some useful links I have perused and possibly some small snippets of code. This will also be useful to me to store my resources in a centralized place.

The Java implementation

The lion’s share of the code resides in a “kernel” class (which delegates to a decent number of focused classes), and is dedicated to complex and not-so-complex numerical tasks, often executed in parallel. A good number of these are implemented using custom numerical routines, ThreadPoolExecutor and Apache Commons-Math.

Numerical libraries

Objective-C’s ability to interface trivially with C libraries opens up a wealth of options to choose from. GSL, for instance, has been conveniently packaged as a framework here. Mac OS X comes with the Accelerate framework preinstalled, which includes LAPACK, BLAS and a number of vectorized operations. The documentation is a bit sparse for some of it; the Apple-provided PDF doc and this nice article were very helpful. SuperMegaUltraGroovy has a simple ObjC wrapper for it (SMUGMath), which I’m testing out now, and it seems nice so far.

Threading

I’m dipping my toes into the threading facilities provided by Cocoa, trying to do a bunch of numerical stuff in a test Foundation project. So far, it looks pretty easy. Specifically, I am using NSOperationQueue (implemented with Grand Central Dispatch in 10.6) to submit a bunch of numerical jobs.

NSOperationQueue is supposed to figure out the best number of threads to spawn for the available hardware, and to reuse threads where possible. Each task in my test code is implemented with an NSInvocationOperation, which invokes the specified @selector, and is stuffed in an NSArray. Finally, the jobs are queued and executed using -addOperations:waitUntilFinished:. With the last parameter set to YES, the current thread is blocked until all operations have finished running. Setting up NSInvocationOperations was a cinch. So far so good!