We have bar charts and assorted graphs embedded in Word documents. Because they are low-res Word pictures, to go into production (i.e., layout in Quark or InDesign), they have to be recreated. If there’s no source file available, it has to be done by hand. I believe the technical term for this is tedious pain in the ass.
I’m grasping at straws here, but is there anything out there that can take a simple graphic as input and roughly translate it into a table of numbers?
I can’t imagine there’s a perfect solution, but given the number of files we work with on a daily basis, if there’s something out there that can help with even a small percentage of them, the time savings would be enormous.
Is this for a business? That is what custom development is for, but it costs real money, whether you use in-house resources or hire a contractor.
This is the type of thing that I do but I don’t need freelance work at the moment. I can give you some advice though. Your job is to nail down the requirements better than you outlined in your OP before anyone or any package can help you.
In particular,
Are these all digital graphs, or do you have some that are just paper as well?
Are they all roughly the same format or very similar (easiest), or are they very different graphs using all kinds of scales, values, and formats (hardest)?
Can you outline a series of specific steps that you could follow from inputs to an acceptable output?
What do you want the final output to look like? Just numbers, or a formatted graph with labels attached?
From a cost/benefit perspective, how important is each step of the automation to you? A dream package would do everything on a single click, but that probably isn’t feasible. Where would you get the most gain? Sizing the graphs? Adjusting scales? Labeling? Final formatting? Embedding it in another document? Each of those steps could probably be made more automated, but it takes time and money to build a solution. Where is the fastest gain?
If they really are just Word pictures, this might be a very difficult problem. However, they might be a Word representation of numbers which would be much better. As a guess, MS Excel might be able to handle that because it is fairly straightforward to do custom programming in Excel and it can interface with Word and the other MS Office apps. An assortment of pure pictures of various types could be basically an exercise in why artificial intelligence is really hard.
Here is another person asking a similar question about recreating graphs from pictures. It isn’t easy, but the thread does reference one software package called Curvescan, which I have no experience with.
You are looking for a data digitizer (Google finds a few). Some may be able to find the curve automatically. If the program lacks that capability, or the plot is too low-quality, you may have to do some mouse clicking along the graph so the program can “understand” where the curve is.
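For what it’s worth, the arithmetic behind the “mouse clicking” approach is simple. Below is a minimal Python sketch, assuming linear (not logarithmic) axes and assuming you have already read off pixel coordinates by hand or with some tool. The function names and the sample numbers are made up for illustration and do not come from any particular digitizer package.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Build a pixel -> data-value converter for one linear axis.

    Calibrate with two known points on the axis, e.g. two tick marks
    whose printed values you can read off the chart.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        return value_a + (pixel - pixel_a) * scale

    return to_value


def digitize(clicks, x_cal, y_cal):
    """Convert clicked (px, py) pixel pairs into approximate data pairs."""
    to_x = make_axis_mapper(*x_cal)
    to_y = make_axis_mapper(*y_cal)
    return [(to_x(px), to_y(py)) for px, py in clicks]


# Hypothetical calibration: (pixel, value, pixel, value) per axis.
# Image y coordinates grow downward, so the larger value has the smaller pixel.
x_cal = (100, 0.0, 500, 10.0)
y_cal = (400, 0.0, 50, 70.0)

# Hypothetical clicks along the curve, in image pixels.
clicks = [(100, 400), (300, 225), (500, 50)]
points = digitize(clicks, x_cal, y_cal)
```

The results are only as good as the clicks and the image resolution, so expect approximate values that still need a sanity check against the printed axis labels.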
Thanks. Virtually everything we get is in Word format, and this comes into play when the task manager can’t get the underlying data from the author (typically because the author is working from other electronic media). When we can get source numbers, Illustrator makes short work of building things.
Automating this would probably save us a few hours per chart (the number of charts varies with the publication). Probably not enough to sub out the manual labor of typing in the data by hand, but if something can read a simple chart (some will still need to be done by hand) and pass an Excel/CSV file of approximate data, it would likely cut the time down to a half hour of polishing.
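To make the “approximate data as CSV” idea concrete, here is a rough Python sketch for the simplest case, a vertical bar chart. It assumes the bars start at a zero baseline, the value axis is linear, and someone has already measured the pixel row of each bar top plus one reference gridline. The function name, labels, and numbers are all hypothetical.

```python
import csv
import io


def bars_to_csv(labels, bar_top_px, baseline_px, ref_px, ref_value, decimals=1):
    """Turn measured bar-top pixel rows into CSV text of approximate values.

    baseline_px: pixel row of the zero line.
    ref_px / ref_value: pixel row and printed value of one gridline.
    """
    if ref_px == baseline_px:
        raise ValueError("reference gridline must differ from the baseline")
    units_per_px = ref_value / (baseline_px - ref_px)

    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["label", "approx_value"])
    for label, top in zip(labels, bar_top_px):
        value = round((baseline_px - top) * units_per_px, decimals)
        writer.writerow([label, value])
    return buf.getvalue()


# Hypothetical measurements: baseline at row 400, the "60" gridline at row 100.
csv_text = bars_to_csv(
    labels=["Q1", "Q2", "Q3"],
    bar_top_px=[300, 250, 150],
    baseline_px=400,
    ref_px=100,
    ref_value=60.0,
)
```

The resulting text can be saved with a .csv extension and opened in Excel for the polishing pass.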