  #51  
Old 09-04-2019, 10:28 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
Quote:
Originally Posted by Lance Turbo View Post
One of my side projects is a Python script that pulls data from the 538 publicly available dataset every day at 8:55am MT and computes the difference between Rasmussen and non-Rasmussen pollsters on Trump approval. It then plots the last 90 days of that data and Tweets out an image of that plot.

Here's the one from 5 minutes ago... link.

Almost all of the variation in the last 90 days or so has been due to Rasmussen.
Quote:
Originally Posted by Lance Turbo View Post
I think I've found where folks are getting hung up. When I said, "Almost all of the variation in the last 90 days or so has been due to Rasmussen," I meant in my chart, not in the 538 chart. In my chart, there is a mostly flat line, a line with big swings, a line that is the difference between the two. Almost all the variation in the difference line is due to the wiggly line, not the flat line.
Ok, I'm not a Data Scientist so could you explain why only "almost" all of the variation in the difference line is caused by Rasmussen? I mean, to this layman, you are plotting how different Rasmussen is from the average so I would think Rasmussen is entirely to blame for every variation from the average.
  #52  
Old 09-04-2019, 10:30 PM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
Yes Lance Turbo, I am free to disagree.

More accurately, though, it is Silver's description of how his model works, and the data showing that his tracker moves most in concert with the higher rated houses' movements as a group, that are free to disagree.

A single C+ pollster simply cannot move the 538 tracker by any significant degree, no matter how big of an outlier they are. And his poll does not follow the average of all polls published equally weighted either.

You are free to be wrong.



Slow pitch, as best I can ...

Silver's model counts all polls but in Animal Farm fashion some polls are more equal than other polls. A C+ pollster gets very little weight. A pollster that publishes every day does not increase its weight over a similarly rated pollster that releases once every 20 days. As a C+ pollster Rasmussen has very little impact on 538's numbers, much less than an A+ to B+ pollster that publishes every 3 weeks, no matter how often they publish results. The group of those top rated and heavily weighted pollsters are what drive 538's numbers significantly, and their average change from most recent to next most recent is what drove the (insignificant) "drop" noted by the op. The average of all polls published equally weighted has nothing to do with how the 538 model works.


Lance Turbo seems to think that because sometimes Rasmussen's noise goes the same way as the 538 tracker (mostly driven by the top level houses, not by his Blue line all non-Rasmussen polls) goes, (and of course sometimes the opposite direction too, even within his sample graph) it means that Rasmussen MUST drive the results. It's a pretty absurd conclusion.
  #53  
Old 09-05-2019, 12:14 AM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
Ok, I'm not a Data Scientist so could you explain why only "almost" all of the variation in the difference line is caused by Rasmussen? I mean, to this layman, you are plotting how different Rasmussen is from the average so I would think Rasmussen is entirely to blame for every variation from the average.
What I was trying to say was that in the graphic that I made and linked to, the red line moves up and down a lot, and the blue line doesn't move up and down much at all. This means that the green line moves up and down a lot due mostly to the red line and not the blue line.

That's a pretty indisputably correct description of that graphic.
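
A minimal sketch of why the word is "almost" rather than "entirely", using made-up numbers and not the 538 data. The series `red` and `blue` below are hypothetical stand-ins for the wiggly line and the flat line; the only thing relied on is the standard identity Var(red - blue) = Var(red) + Var(blue) - 2 Cov(red, blue).

```python
from statistics import mean, pvariance

# Hypothetical daily net-approval values, NOT real polling data.
red = [-2.0, -6.0, -1.0, -8.0, -3.0, -9.0, -4.0]           # wiggly line
blue = [-12.0, -12.2, -11.9, -12.1, -12.0, -12.3, -11.8]   # nearly flat line
green = [r - b for r, b in zip(red, blue)]                  # difference line


def pcovariance(xs, ys):
    """Population covariance of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


var_red = pvariance(red)
var_blue = pvariance(blue)
cov = pcovariance(red, blue)
var_green = pvariance(green)

# The variance of the difference splits into three pieces.
print(f"Var(green)          = {var_green:.3f}")
print(f"Var(red)            = {var_red:.3f}")
print(f"Var(blue)           = {var_blue:.3f}")
print(f"-2 * Cov(red, blue) = {-2 * cov:.3f}")
```

With a flat enough blue line, the blue variance and the covariance term are small but not zero, so the wiggly line accounts for nearly all of the variation in the difference without accounting for every last bit of it.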
  #54  
Old 09-05-2019, 12:31 AM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by DSeid View Post
A single C+ pollster simply cannot move the 538 tracker by any significant degree, no matter how big of an outlier they are. And his poll does not follow the average of all polls published equally weighted either.
This is likely incorrect. Even a pollster that has a lower weight can probably move the model with a large enough swing. In particular they could do so when all the other pollsters, on average, are basically static. It has nothing to do with Rasmussen being an outlier and everything to do with Rasmussen trending massively and differently than the average of other pollsters over long periods of time.

Quote:
Originally Posted by DSeid View Post
Lance Turbo seems to think that because sometimes Rasmussen's noise goes the same way as the 538 tracker (mostly driven by the top level houses, not by his Blue line all non-Rasmussen polls) goes, (and of course sometimes the opposite direction too, even within his sample graph) it means that Rasmussen MUST drive the results. It's a pretty absurd conclusion.
This is also incorrect. I have taken steps to remove the day to day noise in the Rasmussen data so I could better capture how Rasmussen trends compared to the average of all other pollsters. Furthermore, my conclusion is not that Rasmussen MUST drive the results. I have presented a hypothesis and data that supports that hypothesis. I certainly could be wrong because my hypothesis relies on unverifiable assumptions about the 538 model.

DSeid thinks my assumptions are incorrect. DSeid is also making several unverifiable assumptions about 538's model which I think are incorrect.
  #55  
Old 09-05-2019, 01:07 AM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Also I am not claiming to have proved that Rasmussen is solely responsible.

I am guessing that Rasmussen is noticeably responsible. My guess, which may be wrong, is backed up by evidence.

Mostly I just wanted to share my sweet-ass Rasmussen tracker Tweet-bot.

Since no one asked, I'll tell you a funny story about that tracker. This is in no way intended to bolster my argument in this thread. It's just an amusing anecdote.

One afternoon I did not check Twitter for a few hours and was shocked to open it up and see a bunch of notifications. A brief investigation revealed that Rasmussen Reports had Tweeted a screen cap of that day's graph.

And then... Nate Silver retweeted the Rasmussen Tweet to throw a little shade. Nate has 3.2 million followers, so that explained all my notifications.

The funny part of the story is that I change my Twitter handle from time to time. Often it is a thinly veiled Cubs reference like Dr. Addison Sheffield or Dr. Eamus Catuli. Sometimes it is a reference to some bit of news that I found amusing like Dr. Humboldt Gator or Dr. Dark Matter Hurricane. Near the time of this incident Dr. Marijuana Pepsi, the real name of a woman who had just earned her PhD, was trending all over the internet so for some dumb ass reason I changed my handle to Dr. L.S.D. Snapple for a brief time.

So the one, and probably only, time Nate Silver will ever mention me plays out like this...

Rasmussen: "Our thanks to Dr. L.S.D. Snapple for his illustration..."

Nate Silver: "I'm not a polling expert like Dr. L.S.D. Snapple..."
  #56  
Old 09-05-2019, 09:34 AM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
Quote:
Originally Posted by Lance Turbo View Post
What I was trying to say was that in the graphic that I made and linked to, the red line moves up and down a lot, and the blue line doesn't move up and down much at all. This means that the green line moves up and down a lot due mostly to the red line and not the blue line.

That's a pretty indisputably correct description of that graphic.
The red line is for the immediate loading and unloading of passengers only ...


My last crack at this -

My only assumption is that Silver's model is run the way Silver states it is run. True, I cannot verify that he actually doesn't do something completely different.

But testing that as a hypothesis I took eight of the highest rated polling houses, which by Silver's description should be the biggest drivers of his tracker's numbers, and looked at their movement over the past 4 to 6 weeks. On average they had a net drop of 4.5. Which supports the hypothesis that Silver is not lying about how his model works.

The hypothesis that Silver is not lying and that the highly rated houses are the bigger inputs is supported.


Now let's test your hypothesis that Rasmussen's movements are the significant factor in the 538 tracker's movement over this past 4 to 6 weeks period.

By your graph over that period of time Rasmussen's 14 day moving average dropped something like 3. Does not seem so different or big as to drive much when averaged with a bunch of others.

There's over 20 polling houses going into the 538 algorithm.

For shits and giggles let's imagine C+ rated Rasmussen was weighted as much as 50% as heavily as those A+ to B+ ones (a likely over-estimate, but that is an unverifiable assumption), that C+ to B all get the same weight (also a generous assumption), and that instead of the top tier showing that drop over that time period they had all, except for Rasmussen, been flat. And we'll call it 20 going into the tracker. How much would the tracker have moved because of Rasmussen's moving 3? 0.1%.

How about if it had moved 5 while everything else stayed flat? 0.17%. Nonzero but still not very "noticeable". If it had moved 15?!? 0.5%. Can a pollster that has a lower weight probably move the model with a large enough swing? Not by a significant amount even with a very large swing.

How about we assume that Silver is lying and Rasmussen is not underweighted, everyone counts the same? A move of 5 while everyone else was flat would move the tracker by 0.25%, and a Rasmussen move of 15 would move the tracker all of 0.75%.
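
The arithmetic behind those figures can be sketched as a plain weighted average. This is not 538's formula, only the back-of-envelope version. The function name `tracker_shift` is made up for illustration, and the split of the 20 houses into 10 at full weight and 10 at half weight is an assumption: it is not spelled out above, but it is the split that reproduces the 0.1, 0.17 and 0.5 figures.

```python
def tracker_shift(move, mover_weight, other_weights):
    """Shift in a weighted average when one pollster moves by `move`
    and every other pollster stays flat."""
    total = mover_weight + sum(other_weights)
    return move * mover_weight / total


# Assumed split (hypothetical): 10 houses at weight 1.0 and 10 at weight 0.5,
# with Rasmussen being one of the half-weight houses.
others_half_case = [1.0] * 10 + [0.5] * 9   # the 19 houses other than Rasmussen
others_equal_case = [1.0] * 19              # everyone counts the same

for move in (3, 5, 15):
    underweighted = tracker_shift(move, 0.5, others_half_case)
    equal = tracker_shift(move, 1.0, others_equal_case)
    print(f"Rasmussen moves {move:>2}: "
          f"half weight -> {underweighted:.2f}, equal weight -> {equal:.2f}")
```

The shift is just the move times Rasmussen's share of the total weight, so under these assumptions it scales linearly with the size of the swing.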


If your hypothesis or opinion is that the shifts you've seen in Rasmussen are responsible for significant moves of the 538 tracker then it is soundly falsified.


Hell let's overweight Rasmussen! Instead of a lower weighted one of over 20 polling houses, instead of even equal weight as all the others of 20, we'll credit it with being weighted twice as high, it is as if it was one tenth of the tracker all by itself. How much does it need to move to move the tracker by the 3 the tracker has moved if everyone else averages out flat? 30 points.

Maybe if you assume that Silver is lying and he actually counts every day's result of Rasmussen as much as he counts ones that report every 3 weeks or less often. Then it and the other frequent reporters like Harris would dominate, even if they were lower weighted. Maybe. And I guess Silver not lying is unverifiable.



Nice Tweet-bot?
  #57  
Old 09-05-2019, 10:02 AM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
I am not assuming Silver is lying. That's absurd.
  #58  
Old 09-05-2019, 10:23 AM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Assume there are two pollsters, Q and R. Q is an A+ rated pollster that polls once a month and R is a C+ rated pollster that polls every day.

Let's say we have 2 data points from Q on day 1 and day 30 of some time period and both those data points are Trump -10. If we put those 2 data points, and only those 2 data points, into the 538 model my assumption is that it would return a flat line at Trump -10.

Furthermore say we have 30 data points from R and each data point for days 1 through 30 is Trump +0. If we put those 30 data points, and only those 30 data points, into the 538 model my assumption is that it would return a flat line at Trump +0.

Now let's put all 32 data points into the model. My assumption is that at day 1 and at day 30 the model outputs something quite close to Trump -10 because Q is so much more highly rated than R. However, I'm not sure where exactly to expect the model's output to be on day 29. Even though Q is highly rated, its data is four weeks old at that point. My assumption is that the model puts very little weight on the four week old data from Q and outputs something closer to R on day 29. Possibly something around Trump -2 or Trump -1. This assumption does not require Nate Silver to be lying.

What do you think the 538 model outputs on day 29 in this hypothetical, DSeid?
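
To make the hypothetical concrete, here is a toy sketch. It is emphatically not the 538 model: the exponential decay, the `half_life` parameter, the base weights and the 1/20 discount on each daily poll are all assumptions chosen for illustration.

```python
def model_output(polls, day, half_life):
    """Weighted average of the poll margins published on or before `day`.

    polls: list of (poll_day, margin, base_weight) tuples.
    Each poll's weight halves every `half_life` days after publication.
    """
    num = den = 0.0
    for poll_day, margin, base_weight in polls:
        if poll_day > day:
            continue  # not yet published
        weight = base_weight * 0.5 ** ((day - poll_day) / half_life)
        num += weight * margin
        den += weight
    return num / den


# Hypothetical weights: each Q poll counts 3.0; each daily R poll counts
# 0.5 / 20, mimicking a discount for publishing far more than once per 20 days.
Q_WEIGHT, R_WEIGHT = 3.0, 0.5 / 20
polls = [(1, -10.0, Q_WEIGHT), (30, -10.0, Q_WEIGHT)]
polls += [(d, 0.0, R_WEIGHT) for d in range(1, 31)]

slow_decay = model_output(polls, day=29, half_life=1e9)  # effectively no decay
fast_decay = model_output(polls, day=29, half_life=7)    # weight halves weekly
print(f"day 29, no decay:     Trump {slow_decay:+.1f}")
print(f"day 29, weekly decay: Trump {fast_decay:+.1f}")
```

In this toy version the day 29 output lands somewhere between Q and R, and how close it sits to R depends entirely on the decay rate and the discount, which are exactly the unverifiable parts.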
  #59  
Old 09-05-2019, 11:37 AM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
In a hypothetical world in which there are only two pollsters going into the model, one A+ and one C+, then every rational analyst would keep the only A+ poll's last report active and driving the numbers until it was replaced by that house's new results. 538 is run by rational analysts.

On day 29 the model would output something quite close to Trump -10.

With only one quality pollster going into a model reporting only monthly any rational analyst would state that we have little ability to report on any interval change between those reporting dates. If a significant event occurred in the interim we would look to the C+ rated daily tracker to get some very tentative sense of directionality and magnitude of impact but its impact on the bottom line number output would still be minimal until the A+ one reported.

Of course with multiple high quality houses and more medium quality houses providing input to the aggregate, aging polls can get weights decreased as they age, with more recent quality house results having more strength and more frequently reporting houses with more like B and B- ratings getting pooled together, weighted appropriately lightly, but able to reflect shorter term changes. For 538 that means B rated YouGov contributes most to catching the most immediate changes. Rasmussen and HarrisX not so much. Ipsos a bit less immediate but still fairly frequent.

But I'll play even more with your hypothetical. Let's assume for the sake of discussion a less than completely rational analyst and in that world the A+ poll loses weight dramatically by day 29. What would be the output by 538's methods? Still about -10. See, the other thing that 538 does is shift for house effect and the C+ house, with its 10 point house effect, would have been shifted. Since it showed no change over those days it would move the output by zero. (Note same would be functionally the case if it was the A+ house with the house effect or they each had ones that contributed.) This helps make their output less volatile as different pollsters age out or drop new results.
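
A rough sketch of that house effect point, under simplifying assumptions: the adjustment is modeled as subtracting a fixed per-house offset before a plain weighted average, which is a stand-in and not 538's actual procedure. The function name and the weights are hypothetical.

```python
def adjusted_average(polls):
    """polls: list of (margin, house_effect, weight) tuples.
    Each margin has its house effect subtracted before averaging."""
    total = sum(weight for _, _, weight in polls)
    return sum((margin - effect) * weight for margin, effect, weight in polls) / total


# Q reads -10 with no house effect; R reads 0 with an assumed +10 house effect.
for q_weight in (3.0, 0.2):  # fresh vs heavily decayed A+ poll (made-up weights)
    output = adjusted_average([(-10.0, 0.0, q_weight), (0.0, 10.0, 1.0)])
    print(f"Q weight {q_weight}: output {output:+.1f}")
```

Once both houses are adjusted to the same level, the relative weights stop mattering for the level of the output; only a change in one house's adjusted number can move it.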

PLEASE NOTE, and this was my main bit for this thread, their sophisticated methods do not completely remove all false signal volatility/noise resultant of the timing of new data inputs, the aging process, and MOEs or other error sources within each poll. I think it is as close to a gold standard as we can get but movements within that +/- 2 points that are not "real" may have to be expected even with the most sophisticated aggregation approach.
  #60  
Old 09-05-2019, 12:11 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by DSeid View Post
In a hypothetical world in which there are only two pollsters going into the model, one A+ and one C+, then every rational analyst would keep the only A+ poll's last report active and driving the numbers until it was replaced by that house's new results. 538 is run by rational analysts.
This is where our assumptions are different. It is entirely rational to weight fresh C+ data higher than four week old A+ data. It's a C+ not an F-. I would be stunned if 538 did not have some sort of time decay for polls built into their model. A rational analyst strikes a balance between using the highest quality data and the freshest data.

There is exactly one University of Maryland/Washington Post poll in the 538 data set. They are rated A+ and the number in the weight column is 2.99 which is the highest weight in the whole data set. It was in the field Sept. 27-Oct. 5, 2017. My assumption is that this poll is given zero weight in calculating today's data point. Your assumption is that that poll has been given the same weight for almost two years while they wait for another University of Maryland/Washington Post poll.

Frankly, your assumption is bonkers.

Quote:
Originally Posted by DSeid View Post
But I'll play even more with your hypothetical. Let's assume for the sake of discussion a less than completely rational analyst and in that world the A+ poll loses weight dramatically by day 29. What would be the output by 538's methods? Still about -10. See, the other thing that 538 does is shift for house effect and the C+ house, with its 10 point house effect, would have been shifted. Since it showed no change over those days it would move the output by zero. (Note same would be functionally the case if it was the A+ house with the house effect or they each had ones that contributed.) This helps make their output less volatile as different pollsters age out or drop new results.
Good point about house effect, but I'd like to ignore it for the purpose of this hypothetical since it has no bearing on whether or not a big swing from a lower rated pollster can move the 538 needle. The hypothetical is not quite at that point yet, but that's where this is heading.
  #61  
Old 09-05-2019, 12:31 PM
Fiddle Peghead's Avatar
Fiddle Peghead is offline
Guest
 
Join Date: Mar 2001
Location: Harlem, New York, NY
Posts: 4,496
Quote:
Originally Posted by Lance Turbo View Post
The funny part of the story is that I change my Twitter handle from time to time. Often it is a thinly veiled Cubs reference like Dr. Addison Sheffield...
Oh, that Dr. Addison Sheffield. Wait, who the hell is Dr. Addison Sheffield?

Quote:
This is where our assumptions are different. It is entirely rational to weight fresh C+ data higher than four week old A+ data. It's a C+ not an F-. I would be stunned if 538 did not have some sort of time decay for polls built into their model. A rational analyst strikes a balance between using the highest quality data and the freshest data.
This was my first thought about DSeid's hypothetical. A C+ is far from perfect, but would its weight not be adjusted relative to an older A+ poll, even by a small amount? Reading again, I think DSeid actually addressed this....

Last edited by Fiddle Peghead; 09-05-2019 at 12:35 PM.
  #62  
Old 09-05-2019, 12:56 PM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
In a hypothetical world with only two pollsters? Not by much. I'd be saying that we can only reliably report data out monthly and in between we can get a hint as to whether or not an event has had some impact but only a hint. C+ is a pretty poor rating and such a house should be, by itself, mostly ignored (only worthwhile in aggregate with many others weighted lightly).

But of course the world we have is multiple polling houses reporting at different intervals with different ratings. The more heavily weighted daily tracker (with each day discounted) would be YouGov, which went from -8 on 7/8-9 to -11 on 8/27-28, with some daily noise of course and using the RV set (hitting -13 at times). Net drop of 3 as the trend.

Again consistent with what 538 reported out.

Harris and Rasmussen are rated low and each have little impact by themselves.
  #63  
Old 09-05-2019, 01:48 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
It's not a hypothetical world in which there are only two pollsters.

I'm asking about entering hypothetical data from exactly two hypothetical pollsters into the 538 model that exists in the real world. For the sake of argument assume these two hypothetical pollsters have a perfect zero house effect.

Quote:
Originally Posted by DSeid View Post
Harris and Rasmussen are rated low and each have little impact by themselves.
This is almost certainly not true.
  #64  
Old 09-05-2019, 06:33 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by Fiddle Peghead View Post
Oh, that Dr. Addison Sheffield. Wait, who the hell is Dr. Addison Sheffield?
I'm not sure if you're really asking for more information here, but the right field gate at Wrigley Field is right at the intersection of Addison and Sheffield.
  #65  
Old 09-05-2019, 08:44 PM
Snowboarder Bo's Avatar
Snowboarder Bo is offline
Member
 
Join Date: May 2005
Location: Las Vegas
Posts: 27,597
"1060 West Addison… that's Wrigley Field!"
  #66  
Old 09-05-2019, 11:33 PM
Happy Fun Ball is offline
Member
 
Join Date: Jan 2000
Location: The down hill slope
Posts: 3,184
Quote:
Originally Posted by Lance Turbo View Post
One of my side projects is a Python script that pulls data from the 538 publicly available dataset every day at 8:55am MT and computes the difference between Rasmussen and non-Rasmussen pollsters on Trump approval. It then plots the last 90 days of that data and Tweets out an image of that plot.

Here's the one from 5 minutes ago... link.

Almost all of the variation in the last 90 days or so has been due to Rasmussen.
Wow, that's great! What other side projects do you have?
  #67  
Old 09-05-2019, 11:58 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by Happy Fun Ball View Post
Wow, that's great! What other side projects do you have?
I think the following is a complete list of my Twitter-bots.

Baseball
NL West Daily Tracker
NL Central Daily Tracker
Cubs Cardinals Comparator
Cubs Brewers Comparator
Cubs Cards Brewers Three Way Race
Ian Happ News Tracker
Rockies Taco Bot

Miscellaneous
Daily Florida Man
538 Congress Tracker Tracker
Iowa Caucus Countdown/Throw Shade at Ted Cruz
  #68  
Old 09-06-2019, 11:25 AM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
For anyone interested in how the 538 model actually works (and why a C+ daily tracker is CERTAINLY of little impact to it) here is their fairly detailed explanation. While going back and forth with an individual poster is silly, some here might be interested in the weeds of their methods, and the details do inform as to the original op.

Again, of note, they handle frequently reporting houses thusly:
Quote:
If it does so more often than about once per 20 days, each instance of the poll is discounted so that the pollster doesn't dominate the average just because it's so prolific. Daily tracking polls also receive special handling from the formula so that interviews are not double-counted.
The maximal impact of a polling result is determined by the weight as determined by its rating. That rating also determines how quickly the weight drops off.
Quote:
The worse the pollster's rating, the quicker it encounters diminishing returns in our formula
Each instance of a C+ house not only has little weight but its weight decays very quickly; each instance of a highly rated house not only has high weight but its weight decays slowly. Each instance of a C+ daily tracker has a small fraction of the maximum a C+ result could have and decays quickly.

From there the curve is drawn using local polynomial regression with aggressive smoothing.


Their discussion of uncertainty is the most pertinent to this thread however. Short version is that the shaded bars on their graph - those are best understood (but not really) as 90% confidence bars. Note that they go +/- roughly 4 or 5 points. In fact Trump's numbers have been within +/- 2 of a net approval of -12 most of the time over the past year and a half. It is likely not worthwhile to read too much into movements until they are staying out of that range for more than a few days.
  #69  
Old 09-06-2019, 11:33 AM
elucidator is offline
Charter Member
 
Join Date: Mar 2000
Location: Further
Posts: 60,189
At any rate, the massive backlash that will propel Trump to victory in 2020 has not yet manifested? Well, that's reassuring....
__________________
Law above fear, justice above law, mercy above justice, love above all.
  #70  
Old 09-06-2019, 11:39 AM
scr4 is online now
Guest
 
Join Date: Aug 1999
Location: Alabama
Posts: 16,223
Quote:
Originally Posted by elucidator View Post
At any rate, the massive backlash that will propel Trump to victory in 2020 has not yet manifested? Well, that's reassuring....
His approval ratings are already high enough for him to win. Trump's favorability rating right before the election was 40%, same as his current approval rating.
  #71  
Old 09-06-2019, 11:55 AM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
Quote:
Originally Posted by Happy Fun Ball View Post
Wow, that's great! What other side projects do you have?
I've got to ask, what exactly impresses you about a graph that shows one pollster has different movement than the average of all other pollsters? Or are you just impressed by the existence of a Twitter bot?
  #72  
Old 09-06-2019, 12:02 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by DSeid View Post
For anyone interested in how the 538 model actually works (and why a C+ daily tracker is CERTAINLY of little impact to it) here is their fairly detailed explanation.
Nothing you have posted after this supports your level of certainty. I have read the same things that you have about how the 538 model works and have come to different conclusions than you have. Can't we simply disagree without you several times attributing to me positions I don't hold and belligerently creating bizarre conspiracy theories that require me assuming Nate Silver is lying?

Some of what you just posted actually supports my argument. "The worse the pollster's rating, the quicker it encounters diminishing returns in our formula" implies that even an A+ pollster's impact on the model decays over time, which is consistent with fresh C+ data having more impact on the model than four week old A+ data.

538 took steps so that a daily tracker wouldn't dominate the model, but they didn't reduce it to the point where it couldn't affect the model at all, your certainty to the contrary notwithstanding.
  #73  
Old 09-06-2019, 12:09 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
You are being ridiculous. Of fucking course the Rasmussen numbers have some effect or else they wouldn't be included at all. That doesn't remotely support your idea that they are a driving force behind the minor dip in approval rating. And right in your quote of DSeid, he says little impact not a certainty of no impact. Jesus.

Last edited by CarnalK; 09-06-2019 at 12:14 PM.
  #74  
Old 09-06-2019, 12:14 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
You are being ridiculous. Of fucking course the Rasmussen numbers have some effect or else they wouldn't be included at all. That doesn't remotely support your idea that they are a driving force behind the minor dip in approval rating.
The point of my last post was that all other things being equal, they certainly could be.

The data I've posted in other posts does indeed support this possibility, although it doesn't prove that Rasmussen is solely responsible.
  #75  
Old 09-06-2019, 12:21 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
Your data doesn't support the possibility, it merely fails to refute it. Big difference. Your data shows nothing other than Rasmussen is different than the polling average.

Eta: and I hope to god you know that when Nate said "I'm not a polling expert like Dr. L.S.D. Snapple..." , he was joking.

Last edited by CarnalK; 09-06-2019 at 12:25 PM.
  #76  
Old 09-06-2019, 12:23 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
Your data doesn't support the possibility, it merely fails to refute it. Big difference. Your data shows nothing other than Rasmussen is different than the polling average.
And that difference is in the same direction at the same time as the events that inspired this thread. Therefore it is possible that that difference is partially responsible for those events.
  #77  
Old 09-06-2019, 12:31 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
It is no doubt partially responsible, just like every other pollster that moved downwards is partially responsible. It's a meaningless truism.
  #78  
Old 09-06-2019, 12:47 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
It is no doubt partially responsible, [...]
You should be having this argument with DSeid. He strongly disagrees as far as I can tell.

Here's my argument in three bullet points.
  1. All other things held equal, Rasmussen can move the needle on the 538 tracker.
  2. In the time period in question I have posted evidence that Rasmussen moved in the direction of the dip that inspired this thread.
  3. In the time period in question I have posted evidence that all other things held equal.

I'm not sure which of those you disagree with anymore. I posted as an interesting observation and other posters also found it interesting. I guess you don't find it interesting. Can't you just say that once and move on?
  #79  
Old 09-06-2019, 12:52 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
Quote:
In the time period in question I have posted evidence that all other things held equal.
Are you saying Rasmussen was the only pollster that moved downwards in the last 40 days?
  #80  
Old 09-06-2019, 02:04 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
Are you saying Rasmussen was the only pollster that moved downwards in the last 40 days?
Clearly not.
  #81  
Old 09-06-2019, 02:09 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
So when you say "all other things held equal" , you meant something other than all other things held equal?
  #82  
Old 09-06-2019, 02:17 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
So when you say "all other things held equal" , you meant something other than all other things held equal?
I meant in aggregate. As I have shown and explained. I'm not sure how I could have made that clearer.

Last edited by Lance Turbo; 09-06-2019 at 02:19 PM.
  #83  
Old 09-06-2019, 02:30 PM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
Quote:
Originally Posted by Lance Turbo View Post
You should be having this argument with DSeid. He strongly disagrees as far as I can tell. ...
(I really need to work on my self-control and stop engaging with silliness like this.)

"Partially" is a no duh of course. They are part of the 538 aggregation and have some, very marginal, pretty infinitesimal, impact.

Let's for now ignore all the highly rated houses reporting less than weekly (even though their weights are huge, decrease slowly, and there are on average two putting fresh fully weighted data into the aggregate weekly).

We have two C+ rated daily trackers. Low weights and with weight decaying very quickly.

We have one B rated daily tracker. Moderate weight decaying slowly.

And we have one B- rated and one B+ weekly tracker. Average them as B, as moderate weight decaying slowly.

Again, completely ignore all the various A+ to B+ houses popping fresh data with high weights that decay slowly but that post less than weekly, and, unless I've missed one, we have five trackers that report weekly or more, three averaging B and two C+.

If every one of those trackers except one of the C+ ones stays flat over a month, how far would that one C+ tracker have to move to shift the aggregate of them by 2 points?

Well we don't actually know how little 538 weights a C+ poll and how quickly it decays relative to a B one ... so let's play more with hypotheticals.

The limit would be that 538 actually weights the C+ one exactly as much as the B ones, right? They don't, but given that absurd extreme assumption, counting all five of these as the same weight, and looking only at these weekly-to-daily polls ... one of five equally weighted would need to move ten points to move the average of that restricted group by 2 if all the others were static.

If the result of being a lesser weight that decays more quickly is to decrease the impact of each C+ one to half the effective weight of a B one, which is likely more in the ballpark, then the one C+ house would need to move sixteen points to move the needle by that much.
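To check that arithmetic, here is a minimal sketch. It assumes a plain weighted mean over the five trackers (three B, two C+); the function name `required_move` and the weights 1.0 and 0.5 are just the hypotheticals above, not anything 538 publishes.

```python
def required_move(weights, mover_index, target_shift):
    """Points one tracker must move to shift the weighted mean by
    target_shift while every other tracker stays flat."""
    # The mover's share of the mean is weights[mover_index] / sum(weights),
    # so it has to move target_shift divided by that share.
    return target_shift * sum(weights) / weights[mover_index]


# Hypothetical 1: all five trackers weighted equally.
equal_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
print(required_move(equal_weights, 4, 2.0))  # 10.0

# Hypothetical 2: three B trackers at weight 1.0, two C+ trackers at 0.5.
half_weights = [1.0, 1.0, 1.0, 0.5, 0.5]
print(required_move(half_weights, 4, 2.0))  # 16.0
```

Adding the less-than-weekly, highly rated houses to `weights` only makes the sum bigger, so the required move grows further.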

Meanwhile of course on average about two B+ or higher, much more heavily weighted and slower decaying, houses that each report less frequently than weekly, drop results each week ... (For example in the last two weeks new numbers from two A- rated and two B+ rated houses, and another A- and an A+ one in the two days before that.)


I think it is your "in aggregate" that is messing you up. Your "in aggregate" has nothing to do with 538's aggregation and is immaterial information. You are basically counting a lot of crappy Harris polls that, like Rasmussen, have little input into 538's outputs, and some decent YouGovs, and then a few others of variable quality that pop in that you do not correct by house effect, which adds up to nothing but self-cancelling noise landing as a flat line.

If Rasmussen was not included in the 538 tracker, the tracker line would be virtually (but not exactly) the same as it is now.
  #84  
Old 09-06-2019, 02:36 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by DSeid View Post
If Rasmussen was not included in the 538 tracker, the tracker line would be virtually (but not exactly) the same as it is now.
Likely false in this particular case. And certainly false as a general rule.

Last edited by Lance Turbo; 09-06-2019 at 02:40 PM.
  #85  
Old 09-06-2019, 02:43 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
Your simplistic aggregate is not "all other things". Is that not clear?
  #86  
Old 09-06-2019, 02:50 PM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
One other item to note from all of the above - one should not look to the 538 aggregate to assess the immediate impact of a news event.

Say there is some new Trump idiocy that disgusts many - even all three daily trackers moving in the same direction by the same magnitude will move the 538 needle only so much. The shift has to last long enough to reach the weekly trackers, and maybe a new result from a highly rated house or two that was in the field after the event, before the 538 needle will start to show it. Today's numbers, for example, do not yet reflect any impact, if any, of his handling of Hurricane Dorian.
  #87  
Old 09-06-2019, 03:16 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
July 14-31, Gallup had Trump at 42% approval. Rasmussen had him at 48% July 28-30. On Aug 15-30, Gallup was at 39%; Aug 26-28, Rasmussen had him at 46%. So over the last 40 days, a higher rated pollster had a bigger drop than your bogeyman. In fact, Gallup had a large jump in the disapproval number so the net approval was an even bigger move. Try running your bot on Gallup and tell me you don't see the same "evidence" that Gallup is "partially responsible" for the 538 graph movement.
  #88  
Old 09-06-2019, 05:17 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
Your simplistic aggregate is not "all other things". Is that not clear?
My simplistic aggregate is precisely all other things.
  #89  
Old 09-06-2019, 05:35 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
July 14-31, Gallup had Trump at 42% approval. Rasmussen had him at 48% July 28-30. On Aug 15-30, Gallup was at 39%; Aug 26-28, Rasmussen had him at 46%. So over the last 40 days, a higher rated pollster had a bigger drop than your bogeyman. In fact, Gallup had a large jump in the disapproval number so the net approval was an even bigger move. Try running your bot on Gallup and tell me you don't see the same "evidence" that Gallup is "partially responsible" for the 538 graph movement.
Why don't you do that? I could help you if you really wanted to find out. I suspect you will be disappointed with the results.

Here's why... the approval number in the 538 model drops from 42.5 to 41.3 from Aug 1 to Aug 30 with no new information from Gallup during that period. The 538 model doesn't slope downward in anticipation of a drop in the Gallup poll. Between the release dates of the two Gallup polls we have one decaying Gallup data point. I suppose if it was high enough to pull the model up its decay could contribute to a decline in the model.
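A toy illustration of that last point, as a sketch only: it assumes a simple exponential half-life decay on one old, high poll while the rest of the field is held flat. The numbers, the 15-day half-life, and the helper names `decayed_weight` and `weighted_average` are made up for illustration and are not 538's actual formula.

```python
def decayed_weight(base_weight, age_days, half_life_days):
    """Weight of a poll after age_days under an assumed exponential decay."""
    return base_weight * 0.5 ** (age_days / half_life_days)


def weighted_average(values, weights):
    """Plain weighted mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up inputs: one old poll reading 44 (base weight 3.0) and the rest of
# the field held flat at 41 with a combined weight of 5.0.
old_poll, field = 44.0, 41.0
averages = []
for age in (0, 15, 30):
    w = decayed_weight(3.0, age, half_life_days=15)
    averages.append(weighted_average([old_poll, field], [w, 5.0]))

# The average drifts down toward the field as the old poll's weight decays,
# even though no new data arrived.
print([round(a, 2) for a in averages])
```

So a single stale data point can contribute to a decline, but only if it sat above the rest of the field to begin with.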

That said, I didn't say anything at all about Gallup before this post. Deeper analysis may indeed reveal that they are a contributor to this particular drop. That doesn't really change anything I said about Rasmussen. Also Rasmussen is not my bogeyman. I'm not sure where that comes from.
  #90  
Old 09-06-2019, 05:44 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Here is the code (Python 3) with the parts that post it to my Twitter removed. Feel free to adapt it to your purposes and report back here. Have fun.

Code:
import pandas as pd
import datetime


#Get data
url = 'https://projects.fivethirtyeight.com/trump-approval-data/approval_polllist.csv'
df_raw = pd.read_csv(filepath_or_buffer = url, parse_dates = ['startdate', 'enddate'])

# Drop columns we don't need
df = df_raw.loc[:,['pollster', 'startdate', 'enddate', 'approve', 'disapprove']]

# Add net job approval column
df['net'] = df['approve'] - df['disapprove']

# Add average date column
df['avgdate'] = df['startdate'] + (df['enddate'] - df['startdate']) / 2

# Restrict to only columns of interest
df = df.loc[:,['avgdate', 'pollster', 'net']]

# Split into Rasmussen, non-Rasmussen
df_ras = df[df['pollster'] == "Rasmussen Reports/Pulse Opinion Research"]
df_nonras = df[df['pollster'] != "Rasmussen Reports/Pulse Opinion Research"]

# Helper function. Compute average of the 'net' column for rows whose avgdate is within 7 days (+-) of thedate
def rollingavg(thedate, thedf):
    days = 7
    delta = datetime.timedelta(days=days)
    temp = thedf[(thedf['avgdate'] >= thedate - delta) & (thedf['avgdate'] <= thedate + delta)]
    return temp['net'].mean()
  
# Create final dataframe
min_date = df_raw['startdate'].min()
max_date = df_raw['enddate'].max()
df_final = pd.DataFrame(pd.date_range(min_date, max_date), columns = ['Date'])
df_final['Rasmussen'] = df_final['Date'].apply(lambda x: rollingavg(x, df_ras))
df_final['nonRasmussen'] = df_final['Date'].apply(lambda x: rollingavg(x, df_nonras))
df_final['Difference'] = df_final['Rasmussen'] - df_final['nonRasmussen']
df_final = df_final.set_index('Date')

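# Split the last 90 days for the plot: a solid segment plus a dotted tail for the final 8 days,
# whose +-7 day windows are still filling in (the two overlap by one row so the lines join)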

df_solid = df_final.tail(90).head(83)
df_dots = df_final.tail(8)
df_dots = df_dots.rename(columns={"Rasmussen": "Rasmussenx", 
                                  "nonRasmussen": "nonRasmussenx", 
                                  "Difference": "Differencex"})
df_combined = pd.concat([df_solid, df_dots], axis=1, join='outer')

#Make plot
aplot = df_combined.plot(title = 'Trump Approval (last 90 days)',
                         color = ['r','b','g']*2,
                         style = ['-']*3 + [':']*3)
aplot.legend(['Rasmussen', 'non-Rasmussen', 'Difference'], frameon=False)
aplot.tick_params(labelbottom=True, labeltop=False, labelleft=True, labelright=True,
                  bottom=True, top=False, left=True, right=True)
fig = aplot.get_figure()
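For anyone adapting this to another pollster, here is a minimal sketch of the same rolling-average comparison with the pollster name passed in as a parameter. The function `pollster_vs_field` and the tiny made-up table are hypothetical, so it runs without downloading anything; swap in the real dataframe with its `avgdate`, `pollster`, and `net` columns to use it.

```python
import datetime

import pandas as pd


def rollingavg(thedate, thedf, days=7):
    """Mean of 'net' for rows whose avgdate is within +-days of thedate."""
    delta = datetime.timedelta(days=days)
    window = thedf[(thedf['avgdate'] >= thedate - delta) &
                   (thedf['avgdate'] <= thedate + delta)]
    return window['net'].mean()


def pollster_vs_field(df, pollster, days=7):
    """Daily rolling net approval for one pollster, for everyone else,
    and the difference between the two."""
    target = df[df['pollster'] == pollster]
    field = df[df['pollster'] != pollster]
    dates = pd.date_range(df['avgdate'].min().normalize(),
                          df['avgdate'].max().normalize())
    out = pd.DataFrame(index=dates)
    out[pollster] = [rollingavg(d, target, days) for d in dates]
    out['field'] = [rollingavg(d, field, days) for d in dates]
    out['Difference'] = out[pollster] - out['field']
    return out


# Made-up rows purely for illustration (not real polling numbers).
polls = pd.DataFrame({
    'avgdate': pd.to_datetime(['2019-08-01', '2019-08-01',
                               '2019-08-03', '2019-08-03']),
    'pollster': ['Pollster A', 'Pollster B', 'Pollster A', 'Pollster B'],
    'net': [-10.0, -12.0, -14.0, -12.0],
})
result = pollster_vs_field(polls, 'Pollster A', days=1)
print(result)
```

Note that this is still an unweighted average of everyone else, so it says how one pollster differs from the field, not how much weight either side carries in the 538 model.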
  #91  
Old 09-06-2019, 07:46 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
You started a bot to record how far off Rasmussen is from the average. Maybe it's not your bogeyman but that is an odd amount of attention to pay to a particular pollster.

And no, I'm not going to set up a database to test this. The last 40 days have had a meaningless blip of change, I don't need to dig for answers to a meaningless blip.

Last edited by CarnalK; 09-06-2019 at 07:50 PM.
  #92  
Old 09-06-2019, 08:07 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
You started a bot to record how far off Rasmussen is from the average. Maybe it's not your bogeyman but that is an odd amount of attention to pay to a particular pollster.
Rasmussen seems to get mentioned an odd amount by a particular president, so from time to time I would check how they were doing compared to the field. I knew they had a Republican lean, but I didn't know if that lean was consistent. One of the times I was looking into this was around the time I was looking for a project to use as a proof of concept for some sort of data visualization connecting a data source to my Twitter account using the Twitter API. I chose the 538 approval database as the data source and used it to visualize a question that happened to be on my mind that day. I spent about two hours writing this bot back in January and look at its output once in a while when Rasmussen is trending for some reason or other.


Quote:
Originally Posted by CarnalK View Post
And no, I'm not going to set up a database to test this. The last 40 days have had a meaningless blip of change, I don't need to dig for answers to a meaningless blip.
No one asked you to set up a database. I handed you the code. You seem to be paying an odd amount of attention to something you find meaningless. Other people find this topic interesting. Does that bother you for some reason?
  #93  
Old 09-06-2019, 08:21 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
I do find the topic interesting. I just think you are talking nonsense about the topic.
  #94  
Old 09-06-2019, 08:33 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by CarnalK View Post
I do find the topic interesting. I just think you are talking nonsense about the topic.
That's unfortunate.

You could have taken this opportunity to learn something about data modeling, using smoothing functions to reduce the effects of noise, time decay of data points, and even some pandas dataframe basics.

You chose to go a different way.
  #95  
Old 09-06-2019, 09:02 PM
CarnalK's Avatar
CarnalK is offline
Guest
 
Join Date: Jul 2000
Posts: 18,762
No, I learned some stuff. Thanks, DSeid!
  #96  
Old 09-06-2019, 09:22 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Two days ago DSeid believed that no rational analyst would time decay data.
  #97  
Old 09-06-2019, 09:32 PM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
Carnal K you are welcome. It really is not worth engaging with the silliness that Lance Turbo is posting on this. We can use our time more productively ... say by engaging with the guy muttering on the street corner.
  #98  
Old 09-06-2019, 09:41 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by DSeid View Post
Carnal K you are welcome. It really is not worth engaging with the silliness that Lance Turbo is posting on this. We can use our time more productively ... say by engaging with the guy muttering on the street corner.
You mad bro.

You can admit it. You didn't consider time decaying data before, but you included it in your last analysis. You learned something. You're welcome.

Your failure to include time decay of data led directly to several erroneous conclusions earlier in this thread. Now that won't happen again. This new knowledge is a difference maker.
  #99  
Old 09-06-2019, 10:06 PM
DSeid's Avatar
DSeid is offline
Guest
 
Join Date: Sep 2001
Location: Chicago, IL
Posts: 22,895
(I can't help myself.)

[Deleted to not get warning] ... in a hypothetical universe in which there were only two pollsters, one A+ reporting monthly, and one C+, reporting daily, the C+ one should be pretty much ignored and the reality that we can only report information of value once a month recognized. A recent C+ result is no less shit for being a day old.

I'll illustrate. I am VERY interested in polling about Iowa. But the relatively recent poll by C+ rated Change Research that has Warren way up by 11 there does not inform. The older A+ rated Monmouth Biden +11 and even the much older Selzer Biden +8 are still the placeholders for where that race is at despite the fact that the Change Research result is far more recent.

Recent shit still stinks as bad. I'd love more recent quality polling, and my confidence that Selzer and Monmouth accurately capture where things stand today has decreased. But a C+ poll is still worth near zero, so until a new Selzer or another highly rated house posts, all I can say is we don't know if it has changed or not. The C+ house's result hardly impacts that assessment at all.

I know you don't get that but oh well.
  #100  
Old 09-06-2019, 11:13 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,357
Quote:
Originally Posted by DSeid View Post
(I can't help myself.)

[Deleted to not get warning] ... in a hypothetical universe in which there were only two pollsters, one A+ reporting monthly, and one C+, reporting daily, the C+ one should be pretty much ignored and the reality that we can only report information of value once a month recognized. A recent C+ result is no less shit for being a day old.

I'll illustrate. I am VERY interested in polling about Iowa. But the relatively recent poll by C+ rated Change Research that has Warren way up by 11 there does not inform. The older A+ rated Monmouth Biden +11 and even the much older Selzer Biden +8 are still the placeholders for where that race is at despite the fact that the Change Research result is far more recent.

Recent shit still stinks as bad. I'd love more recent quality polling, and my confidence that Selzer and Monmouth accurately capture where things stand today has decreased. But a C+ poll is still worth near zero, so until a new Selzer or another highly rated house posts, all I can say is we don't know if it has changed or not. The C+ house's result hardly impacts that assessment at all.

I know you don't get that but oh well.
But that's not how Nate Silver does it.

Here is Nate's Iowa forecast for Nov 2016.

The highest weighted poll is indeed A+ Selzer & Company with a huge weight of 3.84, which was in the field Nov 1-4. However, the second highest weighted poll is C+ RABA Research at 2.43, in the field Nov 1-2, followed closely by C- SurveyMonkey at 2.41, in the field Nov 1-7.

They are well ahead of:
A- Quinnipiac University 2.01 Oct 20-26
A- Ipsos 0.68 Oct 17-Nov 6
A+ Selzer & Company 0.57 Oct 3-6
A+ Monmouth University 0.25 Sep 12-14

In particular, note the decay of the Iowa gold standard Selzer poll that is a month old. Fresh C+ data has more than four times the weight of month-old A+ data.
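The ratios can be checked directly from the weights listed above. This is just arithmetic on those quoted numbers; the dictionary keys are labels chosen here, and it makes no claim about how 538 derives the weights.

```python
# Poll weights as quoted above from the Nov 2016 Iowa forecast.
weights = {
    'Selzer (A+, Nov 1-4)': 3.84,
    'RABA Research (C+, Nov 1-2)': 2.43,
    'SurveyMonkey (C-, Nov 1-7)': 2.41,
    'Quinnipiac (A-, Oct 20-26)': 2.01,
    'Ipsos (A-, Oct 17-Nov 6)': 0.68,
    'Selzer (A+, Oct 3-6)': 0.57,
    'Monmouth (A+, Sep 12-14)': 0.25,
}

# Fresh C+ poll versus the month-old A+ Selzer poll.
fresh_c_vs_old_a = (weights['RABA Research (C+, Nov 1-2)']
                    / weights['Selzer (A+, Oct 3-6)'])

# Same A+ pollster, fresh versus about a month old.
fresh_vs_old_selzer = (weights['Selzer (A+, Nov 1-4)']
                       / weights['Selzer (A+, Oct 3-6)'])

print(round(fresh_c_vs_old_a, 2))
print(round(fresh_vs_old_selzer, 2))
```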