Program to extract data from a graph

I find myself in possession of a printout of a graph, from which I need to reconstruct the data that was used in making the graph (unfortunately, my advisor never bothered to save the data itself). I’ve already scanned in the graph and cleaned up the image so that the lines can be distinguished (from each other and from the assorted scribbles that were marked on the graph), but now I need something that’ll read those lines and output data in some usable format, such as an ASCII table. I found a program online called DigitizeIt that seemed to work, but it’s crippleware, and won’t let me save the output, or copy it to the clipboard, unless I pay $50. Everything else I found only works on Windows or Linux (I’m on OSX). Any suggestions?

Manual analysis and data entry?

Is this a continuous graph, or discrete? If it’s continuous, you could pick a reasonable sampling interval (depending on how curved the curve is, etc.), such as 5mm, and measure the height from zero at the end of each interval, using a ruler, then type in the readings you get.

Hire someone to do it?

A friend of mine was raving about how awesome one of these programs was last year. I’m not positive but GetData sounds familiar. If that’s not up to snuff, googling “digitizing data from graph” will give you a dozen or so freeware/shareware options.

I’ve used Engauge on Windows and Linux. The linked version for OSX is slightly dated, but it should work for your needs.

They’re using Engauge at my Uni as well.

Back in the 90s I wrote a piece of software to do this sort of thing, interfaced to an 18" digitizer tablet. With a cross-hair puck it was really accurate at extracting data. It could even cope with skewed axes and multiple datasets. It was a pretty cool bit of software, but the problem was that the printed graphs we started on were not big enough to give any real resolution, even blown up on a photocopier. It was good for graph recreations, though. I guess it is long gone now.

Si

Do PPC apps still work on Intel Macs?

Good question…one that I am not equipped to answer. Here’s the source, which does require some dependencies if you choose to go that route. Not sure if there is a precompiled version for Intel Mac OSX out there. Sorry if I was a bit misleading, but the OP didn’t specify hardware and I’m not a regular Mac user.

Probably not what you’re looking for, but it would be straightforward to write your own simple software to do this. Once you’ve cleaned up the image and stored it in a simple format like PPM, you can locate pixels of desired color with something as simple as grep, and develop (x,y) coordinates easily from pixel location.

Depending on your skill set, this could be simpler and more fun than tracking down pre-written software to do the same thing.
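
For what it’s worth, here is a minimal sketch of that idea in Python instead of grep. It assumes the cleaned-up image was saved as a plain-text (P3) PPM and that the curve is drawn in one known color; the function names, the sample image, and the tolerance parameter are all made up for illustration.

```python
def read_ppm_p3(text):
    """Parse a plain-text (P3) PPM image into (width, height, pixels)."""
    # Drop '#' comments, then split everything into whitespace-separated tokens.
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())
    if tokens[0] != "P3":
        raise ValueError("only plain-text P3 PPM is handled in this sketch")
    width, height = int(tokens[1]), int(tokens[2])
    values = [int(t) for t in tokens[4:]]  # tokens[3] is the max channel value
    pixels = [tuple(values[i:i + 3]) for i in range(0, width * height * 3, 3)]
    return width, height, pixels


def find_color(width, height, pixels, target, tolerance=0):
    """Return (x, y) for each pixel within `tolerance` of `target` per channel.

    x is the pixel column; y is measured upward from the bottom edge.
    """
    hits = []
    for index, rgb in enumerate(pixels):
        if all(abs(c - t) <= tolerance for c, t in zip(rgb, target)):
            row, col = divmod(index, width)
            hits.append((col, height - 1 - row))  # flip so y grows upward
    return hits


# Hypothetical 3x2 test image with one red pixel in each row.
SAMPLE = """P3
# tiny test image
3 2
255
255 255 255   255 0 0   255 255 255
255 0 0   255 255 255   255 255 255
"""

w, h, px = read_ppm_p3(SAMPLE)
red_points = find_color(w, h, px, target=(255, 0, 0))
print(red_points)
```

The coordinates that come out are still in pixels; they need to be scaled against the graph’s axes before they mean anything.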

Actually, I should have just checked for myself. I had heard that Rosetta (which runs PPC apps on Intel Macs) was not included in Mac OS X Snow Leopard. But it seems it’s just not installed by default. So a PPC version should still work.

OK, maybe I’m just dense, but I downloaded Engauge, and I’m seeing all sorts of documentation, auxiliary files, readmes, and so on, but no executable or app. The instructions tell me to untar the executable release package, and then to look for engauge.app , but I can find neither an executable release package nor an app.

NIH Image used to do the sort of thing you’re looking for.
Looks like it’ll run in System 9 emulation under OSX, if you can manage that with the Mac you’ve got, which I doubt.

The FAQ recommends ImageJ, which runs natively on modern Intel Macs: Download

I’ve never tried it.

I used a couple of the digitizing packages but was never entirely satisfied with the result (when dealing with spiky data), so I’ve written scripts in Matlab a couple of times to import an image and scan it for data. This is like fifty or sixty lines of code itself, although I’ve done it as part of a larger application to filter and analyze the data. You should be able to do the same in Mathematica, Octave, Scilab, SciPy, sagemath, or whatever scientific computing scripting tool you prefer; anything with image processing and numerical function capabilities.
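
A rough sketch of what the scanning step might look like, in Python instead of Matlab. This is not the script described above, just one possible approach: it assumes the image has already been reduced to a true/false mask of "curve" pixels, and it walks the columns recording the mean, top, and bottom row of the curve in each. The function name and the toy mask are hypothetical.

```python
def trace_curve(mask):
    """For each column of a 2-D boolean mask, return (col, mean_row, top_row, bottom_row).

    Columns that contain no curve pixels are skipped.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    trace = []
    for col in range(width):
        rows = [row for row in range(height) if mask[row][col]]
        if rows:
            trace.append((col, sum(rows) / len(rows), rows[0], rows[-1]))
    return trace


# Tiny hand-made mask: '#' marks pixels that matched the curve color.
ART = [
    "#....",
    "#.#..",
    ".##..",
    "....#",
]
mask = [[ch == "#" for ch in line] for line in ART]
trace = trace_curve(mask)
for col, mean_row, top, bottom in trace:
    print(col, mean_row, top, bottom)
```

Rows here count down from the top of the image, so they still have to be flipped and scaled to the axes.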

Stranger

If it’s just one graph, you don’t need anything fancy. Load the image into Matlab and use ginput() to manually sample the data at whatever spacing you like. Use the same process to get the coordinates of the axes of the graph. Apply some linear interpolation to transform pixel coordinates into data coordinates.
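
The transform step is short enough to sketch. This is Python, not Matlab, and it assumes both axes are linear and the scan is not rotated or skewed; the calibration numbers and clicked points below are invented for the example.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a data value on a linear axis.

    (pixel_a, value_a) and (pixel_b, value_b) are two known points on the axis,
    for example two labeled tick marks.
    """
    def to_value(pixel):
        return value_a + (pixel - pixel_a) * (value_b - value_a) / (pixel_b - pixel_a)
    return to_value


# Hypothetical calibration: x = 0 at column 100, x = 10 at column 500;
# y = 0 at row 400, y = 50 at row 80 (rows count down from the top of the image).
x_map = make_axis_map(100, 0.0, 500, 10.0)
y_map = make_axis_map(400, 0.0, 80, 50.0)

# Hypothetical clicked points in (column, row) pixel coordinates.
clicked = [(100, 400), (300, 240), (500, 80)]
data = [(x_map(px), y_map(py)) for px, py in clicked]
print(data)
```

Because the y calibration is given as two pixel rows, the top-down row numbering takes care of itself. A log axis would need the same mapping applied to the logarithm of the tick values.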

“[x, y] = ginput” is okay if you just need to grab a few data points, but it is clunky if you have a large set of highly varying data, especially if it is oscillatory or noisy. In that case, you want to sample your data (and if the curve is thicker than a couple of pixels, the high and low) and then use an algorithm to find the local mean and filter out the garbage.
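
As an illustration only (Python, with made-up numbers), one way to do that: take the midpoint of the high and low edge in each column, then run a small sliding window over the result. The window here uses a median where the text above says local mean, since a median throws out an isolated stray mark instead of averaging it in. That is a substitution on my part, and it would also flatten genuine one-sample spikes, so pick the window and statistic to suit the data.

```python
from statistics import median


def midline(highs, lows):
    """Center of a thick curve: the midpoint of its upper and lower edge per column."""
    return [(hi + lo) / 2 for hi, lo in zip(highs, lows)]


def running_median(values, window=3):
    """Median over a sliding window; the window is truncated at both ends of the list."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        smoothed.append(median(values[start:stop]))
    return smoothed


# Hypothetical per-column edges; the fourth column hit a stray scribble.
highs = [10, 11, 12, 40, 14, 15]
lows = [8, 9, 10, 38, 12, 13]

centre = midline(highs, lows)
smooth = running_median(centre, window=3)
print(centre)
print(smooth)
```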

Stranger

I have not tried it, but DigitizeIt probably does what you want to do. There’s a 21-day free evaluation period.

DigitizeIt was the first one I tried. The free version is crippled into worthlessness: You can extract the values, but you can’t get them out of the application.

In case anyone’s still wondering, I ended up getting the Windows version of Engauge, and running it under Wine. It worked.

You know, I was going to suggest just running a Windows program in Wine. I just wasn’t looking forward to having to explain how to get Wine up and running. I shouldn’t have underestimated you. (Or maybe Wine has gotten easier.)

Revisiting the original link that I posted, it looks like it’s just a source file and not a compiled app. Sorry about that. At the risk of again pointing you in the wrong direction, the Fink Project “wants to bring the full world of Unix Open Source software to Darwin and Mac OS X.” It looks like there are many great apps available, including engauge-digitizer. I’ll leave it to Mac experts to weigh in on the usefulness of fink, but it might be worth looking into if you anticipate running OSS that doesn’t have official Mac ports in the future.