This is something I should know from middle school, but with today’s tools I’m hoping it’s easier to do than it was 40 years ago.
I create line graphs for work using Excel. I need to be able to bisect one of these graphs horizontally at a specific point and determine the area of the curve above and below the line. Is there some easy way to do this that doesn’t involve solving equations?
If someone can provide me a simple example of how it’s done that might help me understand it better.
Important question: How are you drawing your lines? Are you just connecting the dots, or are you fitting a smooth curve to the data points? Also, if you are connecting the dots, do you consider the jagged bunch of line segments to be the “true” curve, or are you more interested in the “fundamental” curve of which that’s just an approximation?
How do you go about generating the graphs?
A simple approximation would be to take the rectangular height of each point on the curve (above and below the horizontal line), multiply by the width of each bin, and sum them.
I am just connecting the dots and not fitting a smooth curve to the data points. I need to include all points under the “curve”, even the jagged bunch of line segments.
If I plot the data for a year it will be mostly flat, but for a few weeks there will be large peaks. After those weeks it becomes mainly flat again at the same level as before the peaks happen. I want to draw a line at the level of the flat area so I can determine how much area is between the flat part and the top of the various peaks.
By slicing the region vertically at the data points, you can carve it up into regions that are easy to calculate the area of (like a rectangle with a triangle on top), and add those individual areas together.
Yes, it’s the one you’re using. Given that you are creating these graphs using Excel, it should be easy to set up a formula to calculate the needed areas from the original data. But without some more details about the problem it will be difficult to show an example.
Look for something involving Simpson’s rule. That’s a numerical method that is more or less what Thudlow Boink was suggesting. The basic idea is to break the area under a curve into small segments, approximate their area as polygons, and sum them. Usually, the smaller the segments, the more accurate the estimate. There are curves for which this doesn’t work well, but from your description, it seems that your problem is one where it would be a reasonable approach.
A quick googling turns up some instructions for doing this in Excel. I’m too lazy to evaluate them though, so I won’t vouch for them, and more than likely, you’ll need to adapt them for your particular situation.
Let me try one more time to explain it. I am tracking RMA data using a simple Excel spreadsheet. On the Y axis is the number of RMAs received and on the X axis is time (weeks), so for a one year long graph I have 52 points, each with varying heights depending on the number of RMAs received for that particular week.
For most of the year the RMA volume is fairly steady, but for 2 or 3 months of the year we see a spike in RMAs. This spike is related to a particular problem we are tracking. So we have the baseline level of RMAs and an additional amount of RMAs due to this specific problem.
I want to know the area under the peaks but above the baseline so I can determine how much additional RMA activity is the result of this particular problem.
Yeah, although I was actually thinking of the “trapezoidal rule.” For an actual curved curve, Simpson’s Rule is more accurate, but the OP was talking about a graph made up of connected straight line segments, if I understood him properly.
Given the additional detail offered in post #9, I too, am wondering if you’re not making this a more complicated problem than it really is. If you have a weekly baseline that represents the norm, and data representing what is really happening, why is it difficult to to compare them on a weekly basis?
I think you may have understood him better than I did. :o
Agreed. To add a concrete example, if the whole calendar is 19 weeks long and weeks 8 through 16 are the ones of interest, just take the baseline from the outside weeks (in the example below, avg = 48.2 per week under normal conditions) and subtract it off. That yields the third column here:
Then just sum up (or average, as desired) the excess rates in the weeks of interest, #8 through #16. Here, the excess in weeks 8 through 16 comes to an average of 33.4 RMAs per week.
You can choose to graph the data and overlay a line at 48.2 as well, but there’s no reason to use that graph as a calculation tool.