I’m gathering data online for an industrial product whose various manufacturers document performance with graphs having logarithmic axes. I’m then digitizing the graphed curves and doing further analysis with the numbers. But I discovered something that seems REALLY WEIRD. The graph axes in 8 out of 8 cases have huge inaccuracies. That is, the tick marks are very far out of place. They’ll be several percent off, even tens of pixels.
On a logarithmic axis, if I import the graph to a graphics editor such as Microsoft Paint and copy a little region through which the tick marks for 10 and 20 pass, the distance between those marks represents a factor of two. If I then paste that elsewhere in the image, like over the lines for 3 and 6 or the lines for 50 and 100, the tick marks and the lines should match because there is a factor of two for these lines also. Yet they don’t.
For one example I copied the 1 and 2 tick marks to place the 1 over a 50, and I found that the 2 was over the high 80s and didn’t even reach to the 90, let alone land on the 100.
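For anyone who would rather check an axis with numbers than with copy-and-paste, here is a minimal Python sketch of the same test. It assumes you have read off the pixel position of each labelled tick by hand; the function names and the sample value 88 (a stand-in for "high 80s") are made up for illustration and do not come from any manufacturer's graph.

```python
import math

def ideal_pixel(value, ref_lo, ref_hi):
    """Pixel where `value` belongs on a true log axis pinned at two reference ticks.

    ref_lo and ref_hi are (axis_value, pixel) pairs taken from the graph.
    """
    (v0, p0), (v1, p1) = ref_lo, ref_hi
    frac = math.log10(value / v0) / math.log10(v1 / v0)
    return p0 + frac * (p1 - p0)

def tick_errors(ticks):
    """Map each tick value to (measured pixel - ideal pixel).

    `ticks` is a dict {axis_value: measured_pixel}. The lowest and highest
    ticks are trusted as the reference; that choice is an assumption.
    """
    items = sorted(ticks.items())
    lo, hi = items[0], items[-1]
    return {v: p - ideal_pixel(v, lo, hi) for v, p in items}

# The overlay example in numbers: the pasted "2" mark should land on 100
# when the "1" mark sits on 50. If it lands near 88 instead (hypothetical
# reading), this is the fraction of the expected factor-of-two gap covered.
gap_ratio = math.log10(88 / 50) / math.log10(2)
```

With 88 as the stand-in reading, `gap_ratio` comes out at roughly 0.82, meaning the factor-of-two gap near 50 would be about 18 percent shorter than the one near 1.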
I took all 8 graphs I downloaded from these manufacturers’ web sites and experimented on both horizontal and vertical axes, and did not find a single one that did not have errors of a few percent or more.
It is obvious that these are separate manufacturers graphing separate data sets, so this isn’t a case of one bad graph getting copied everywhere else. And, I don’t have independent versions of the curves themselves to compare (that’s why I’m digitizing graphs in the first place). At this point all I can tell is that every logged axis I grab online has enormous errors.
Is there some common computer oddity that could cause everybody who tries to publish their graphics to suffer distortions?
I remember (I think this was in the Win32 API, but I’m sure it was someplace with broad application) that there’s a base function called DrawOval that draws an ellipse instead of an oval. And we all remember the Intel division bug. I’m wondering if there is some deep error someplace that propagates to all log axes on the web.
I have experienced trouble trying to export graphs in and out of Microsoft applications. For instance, if I try to convert an Excel plot into some sort of vector graphics format, the results are usually shit. You can use Powerpoint to convert the graph into Microsoft’s .emf vector graphics format, which invariably introduces distortions. Similarly, if I want to take beautifully formatted vector graphics plots made by another program and put them in Powerpoint, I’m again stuck with the shit that is .emf, which screws up the position and scale of every element in the plot. (I’ve settled on using a ton of high-resolution bitmaps, because it’s the only way I’ve found to get acceptable images into Powerpoint.)
More generally speaking, IME there’s a lot of software with sloppy vector graphics capabilities.
I noticed a different amazing ignorance a few days ago. Googling “image latitude” one finds many sites like this one which think latitude lines are equally spaced along the axis. :smack: (And that page is a “lesson.”)
Besides Occam’s Razor (never ascribe to fraud or malice what can be explained by ignorance or stupidity), I have no answer for the OP. I post merely to compliment the great username/topic combination.
Indeed.
For your actual problem… I don’t know. Can you tell if they’re all off in the same way? Like if the line that should be for 5 was always at 4.5, and the others likewise at the same wrong place. Then you could correct for the wrong tick marks when you digitize the data. You’d have to find some data where you do have independent versions of the data, to calibrate with.
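If the ticks do turn out to be wrong in a consistent way, the correction could look something like the sketch below. It assumes you have a few calibration pairs of (value the tick is labelled with, value it really sits at), and it interpolates between them in log space. The calibration numbers are invented to match the "5 is really 4.5" example; real ones would have to come from independent data.

```python
import bisect
import math

def make_corrector(calibration):
    """Build a function that maps a value read off the bad axis to a corrected value.

    `calibration` is a list of (labelled_value, true_value) pairs, at least two.
    Interpolation is piecewise linear in log10 space; readings outside the
    calibrated range are extrapolated from the nearest segment.
    """
    pairs = sorted(calibration)
    xs = [math.log10(label) for label, _ in pairs]
    ys = [math.log10(true) for _, true in pairs]

    def correct(reading):
        x = math.log10(reading)
        # Find the segment containing x, clamped to the first/last segment.
        i = bisect.bisect_right(xs, x) - 1
        i = max(0, min(i, len(xs) - 2))
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return 10 ** (ys[i] + t * (ys[i + 1] - ys[i]))

    return correct

# Hypothetical calibration: the line labelled 5 actually sits at 4.5.
correct = make_corrector([(1, 1.0), (5, 4.5), (10, 10.0)])
```

A reading of 5 then comes back as 4.5, and readings between the calibrated ticks are adjusted proportionally in log terms. Whether piecewise-linear interpolation is good enough depends on how the ticks were misplaced, which you can't know without the calibration data.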
“All data is linear when plotted on a log-log plot with a big enough crayon.” – Anon.
In my experience, digitizing plots that were originally produced digitally is always fraught with errors and problems. As mentioned, Microsoft products–and in particular, my technical nemesis, the Excel spreadsheet application–produce plots that are fine enough for business purposes (provided you replace the craptastic default templates with ones that use a decent color scheme and appealing layout) but can be highly misleading if you are plotting a large amount of highly variable data. In particular, I’ve found enormous discrepancies with box plots, high fidelity scatter plots, calculating and plotting regression lines, and especially anything plotted on a log axis. I don’t know if this is just Excel’s shitty handling of logarithmic values or some fundamental fail in how it distributes the log axis, but I wouldn’t trust it for any amount of money. And yet, I am regularly surprised by how many companies use Excel to calculate and plot technical values and don’t backcheck the results.
And I’m sure these errors are not limited to just Microsoft products. As lazybratsche notes, there are a number of vector graphics formats which do not maintain even scaling. The ISO32000 “Portable Document Format” (commonly known as PDF or .pdf) is notorious for this, which is why engineering drawings converted to PDF format should never be scaled for interpretation. (This is true in general, but especially true with digital formats.) So unless you are digitizing (which is a type of scaling) from the original analog source (in the case of hand graphs or pen plotter data) you should treat any previously scanned, photographed, or digitally plotted graph as suspect at best, and to be used for reference only. I say this as someone who routinely digitizes old analog data.
Is there a reason you can’t request the raw source data used to produce the graphs directly from the manufacturers? I know they are often reluctant to provide anything that might be considered proprietary, but given that you are just asking for data on a publicly released plot which is presumably advertising the product capabilities to the end user, I would expect that they would be willing to assist you. Of course, you then have to make sure that they aren’t just scaling off of the same plots and giving you the same craptastic interpolation of bad numbers, but that is the universal pitfall of data; if you don’t control it, you can’t trust it.
Oh, dang. It looks like I’m in error here. Ellipses are well defined conic sections, but “oval” is not so precisely defined, and by many definitions “oval” includes “ellipse” as a special case. “Oval” seems to mean a roundish two dimensional figure, concave inward at all points.
I thought “oval” specifically meant two semicircular ends of the same radius, joined by two straight line segments tangent to the ends, like the race track used by runners, or like a belt around same size pulleys.
My mistake!
Note per the previous post, geometry isn’t my specialty…
Every explicit (i.e. graphically useful) description of an oval has the ends as circular (constant radius) sections connected by curves which are tangent at the connections and symmetric about an axis going through the radial center of the ends, forming a smoothly continuous purely convex (or in the case of straight connecting lines, flat) parametric curve. This is the definition used for descriptive geometry in technical drawings (an oval slot), mechanism kinematics, and other areas where oval shapes are described by analytical geometry. Similarly, an ovoid shape has spherical endcaps connected by a smoothly continuous tangent surface. Ellipses and ellipsoids, which have a very distinct topology (i.e. an acutely inclined conic section defined by five points with a constantly changing radius, except in the case of the degenerate shape with e=0, i.e. a circle, and the axial rotation of this shape for an ellipsoid) are not an oval or ovoid, nor vice versa.
I couldn’t replicate the problem. I see them all the time, but didn’t find one from a manufacturer - I found one from a math website - and the lines line up (they were a pixel or two off - I’m guessing due to rounding).
How large are these graphs? If you are talking 100 pixels wide - maybe it is rounding issues.
Come to think of it, I’ve seen some software with sloppy raster graphics plotting. By necessity there has to be some rounding and approximation between floating-point data and a 320x240 plot. Nearest-neighbor interpolation of the data or even the axes is going to exacerbate inaccuracy (particularly with a log-log plot). Take that low-resolution image, scale it up and smooth it for printing, and the inaccuracy will only grow.
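To put a rough number on the rounding idea, here is a small sketch that converts a pixel misplacement into a relative error in the value read off a log axis. The only input is how many pixels one decade spans; the function name and the sample figures are illustrative assumptions, not measurements from any of the graphs in question.

```python
def value_error_from_pixels(pixel_error, pixels_per_decade):
    """Relative error in a value read off a log axis when a mark is
    `pixel_error` pixels away from where it belongs.

    One decade (a factor of 10) spans `pixels_per_decade` pixels, so a
    shift of d pixels multiplies the reading by 10 ** (d / pixels_per_decade).
    """
    return 10 ** (pixel_error / pixels_per_decade) - 1

# Worst case for pure rounding to the nearest pixel is half a pixel.
rounding_only = value_error_from_pixels(0.5, 100)

# A mark that is 10 pixels out on the same small axis.
ten_pixels_off = value_error_from_pixels(10, 100)
```

On an axis with only 100 pixels per decade, half a pixel of rounding is worth about 1.2 percent in the value, while a mark 10 pixels out is worth about 26 percent. So rounding alone could plausibly explain errors of a percent or so on small plots, but not tick marks that are tens of pixels out of place, unless the image was later rescaled or resampled.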
Be that as it may, I would never describe a shape that is graphically or topologically defined as an ellipse or ellipsoid as an oval or ovoid. An ellipse is a very rigorously and explicitly defined shape that is not generically oval. This would be analogous to calling a cube a “box”; it is a wrong description because it omits specific salient details. This may be fine in casual speech, but it is wholly inaccurate when discussing any formal treatment or use.
You may never say that, but that doesn’t make it wrong. At any rate, the point you’re arguing for is not analogous to calling a cube a box. It’s analogous to saying a box can not be a cube. That’s wrong.
No, it’s not, and trying to turn the argument on its head is disingenuous. An ellipse is a very specific shape which can be described by five points or parameters, and describing it as an “oval” is misleading at best and completely inaccurate at worst. It is like calling a dolphin a fish because it swims in the ocean; while it is true dolphins and fish share certain superficial characteristics, no zoologist would ever conflate them, even in a general sense.
It’s not disingenuous at all. Look at the OP. That is exactly the situation he is reporting. He calls DrawOval, expecting it to draw an oval, and it draws a shape that meets all the requirements of an oval.
And really, equating it to calling a dolphin a fish? That’s just ignorant. A dolphin is categorically not a fish. Dolphins are mammals. They have lungs. Fish have gills. Dolphins do not fit the definition of a fish.
And an ellipse is not an oval, except in the very loose colloquial sense of “a convex round shape” which does not satisfy any analytical or rigorous definition of an ellipse or an oval, i.e. the innate characteristics of an ellipse and an oval are essentially different.
Oh, do I need to provide a cite? Because it’s not hard to find definitions of “oval” satisfied by an ellipse. septimus already linked to Cartesian Ovals, which include ellipses as a subset.