Reason why Trump's approval ratings are falling?

So when you say “all other things held equal”, you meant something other than all other things held equal?

I meant in aggregate. As I have shown and explained. I’m not sure how I could have made that clearer.

(I really need to work on my self-control and stop engaging with silliness like this.)

“Partially” is a no duh of course. They are part of the 538 aggregation and have some, very marginal, pretty infinitesimal, impact.

Let’s for now ignore all the highly rated houses reporting less than weekly (even though their weights are huge, decrease slowly, and there are on average two of them putting fresh, fully weighted data into the aggregate weekly).

We have two C+ rated daily trackers. Low weights and with weight decaying very quickly.

We have one B rated daily tracker. Moderate weight decaying slowly.

And we have one B- rated and one B+ rated weekly tracker. Average them as B, with moderate weight decaying slowly.

Again, completely ignore all the various A+ to B+ houses popping fresh data with high weights that decay slowly but that post less than weekly, and, unless I’ve missed one, we have five trackers that report weekly or more: three averaging B and two C+.

If the other four stay flat over a month, how much would one of the C+ trackers have to move to cause the aggregate of the five to move 2 points?

Well we don’t actually know how little 538 weights a C+ poll and how quickly it decays relative to a B one … so let’s play more with hypotheticals.

The limit would be that 538 actually weights the C+ one exactly as much as the B ones, right? They don’t, but grant that absurd extreme assumption: counting all five of these at the same weight, and looking only at these weekly-to-daily polls, one of the five would need to move ten points to move the average of that restricted group by 2 if all the others were static.

If the effect of a lesser weight that decays more quickly is to cut the C+ one to half the effective weight of a B one, which is likely more in the ballpark, then the one C+ house would need to move sixteen points to move the needle by that much.
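The arithmetic in those two hypotheticals can be checked mechanically. This is my own sketch; the weights are the assumptions from the post above (equal weights in the extreme case, C+ at half a B weight in the plausible case), not 538’s actual values.

```python
# How far must one tracker move to shift a weighted average by a target
# amount, if every other tracker stays flat?
def required_move(weights, mover_index, target_shift):
    # The aggregate shifts by w_i * move / sum(weights),
    # so move = target_shift * sum(weights) / w_i.
    return target_shift * sum(weights) / weights[mover_index]

# Extreme case: all five weekly-or-better trackers weighted equally.
print(required_move([1, 1, 1, 1, 1], mover_index=0, target_shift=2))      # 10.0

# More plausible case: the C+ trackers at half the effective weight of
# the B ones (an assumed ratio, for illustration only).
print(required_move([0.5, 0.5, 1, 1, 1], mover_index=0, target_shift=2))  # 16.0
```

Those outputs match the “move ten” and “move sixteen” figures above.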

Meanwhile, of course, on average about two B+ or higher houses (much more heavily weighted, slower decaying, and each reporting less frequently than weekly) drop results each week. (For example, in the last two weeks: new numbers from two A- rated and two B+ rated houses, and another A- and an A+ one in the two days before that.)
I think it is your “in aggregate” that is messing you up. Your “in aggregate” has nothing to do with 538’s aggregation and is immaterial information. You are basically counting a lot of crappy Harris polls that, like Rasmussen, have little input into 538’s outputs, plus some decent YouGovs, plus a few others of variable quality that pop in, none corrected for house effects. That adds up to nothing but self-cancelling noise landing as a flat line.

If Rasmussen were not included in the 538 tracker, the tracker line would be virtually (but not exactly) the same as it is now.

Likely false in this particular case. And certainly false as a general rule.

Your simplistic aggregate is not “all other things”. Is that not clear?

One other item to note from all of the above - one should not look to the 538 aggregate to assess the immediate impact of a news event.

Say there is some new Trump idiocy that disgusts many. Even all three daily trackers moving the same direction by the same magnitude will move the 538 needle only so much. The effect has to last long enough for the weekly trackers, and maybe a new result or two from highly rated houses in the field after the event, to report out before the 538 needle will start to show it. Today’s numbers, for example, do not yet reflect any impact, *if* any, of his handling of Hurricane Dorian.

July 14-31, Gallup had Trump at 42% approval. Rasmussen had him at 48% July 28-30. Aug 15-30, Gallup was at 39%; Aug 26-28, Rasmussen had him at 46%. So over the last 40 days, a higher rated pollster had a bigger drop than your bogeyman. In fact, Gallup had a large jump in the disapproval number, so the net approval was an even bigger move. Try running your bot on Gallup and tell me you don’t see the same “evidence” that Gallup is “partially responsible” for the 538 graph movement.

My simplistic aggregate is precisely all other things.

Why don’t you do that? I could help you if you really wanted to find out. I suspect you will be disappointed with the results.

Here’s why… the approval number in the 538 model drops from 42.5 to 41.3 from Aug 1 to Aug 30 with no new information from Gallup during that period. The 538 model doesn’t slope downward in anticipation of a drop in the Gallup poll. Between the release dates of the two Gallup polls we have one decaying Gallup data point. I suppose if it were high enough to pull the model up, its decay could contribute to a decline in the model.
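That decay mechanism can be sketched in a few lines. Everything here is assumed for illustration (the poll values, the exponential form, the 14-day half-life); 538’s actual weighting is more complicated, but the qualitative point holds: as a single high data point’s weight fades, the average drifts down with no new data arriving at all.

```python
# Sketch: five static polls plus one older, higher data point whose
# weight decays exponentially. The weighted average declines over time
# purely because the high point's weight fades.
def weighted_avg(points):
    total = sum(w for _, w in points)
    return sum(v * w for v, w in points) / total

field = [(41.0, 1.0)] * 5   # five static polls sitting at 41%
gallup = 44.0               # one older, higher data point
half_life = 14.0            # assumed half-life of its weight, in days

for day in (0, 15, 30):
    w = 0.5 ** (day / half_life)      # weight halves every 14 days
    avg = weighted_avg(field + [(gallup, w)])
    print(day, round(avg, 2))         # the average declines as w fades
```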

That said, I didn’t say anything at all about Gallup before this post. Deeper analysis may indeed reveal that they are a contributor to this particular drop. That doesn’t really change anything I said about Rasmussen. Also Rasmussen is not my bogeyman. I’m not sure where that comes from.

Here is the code (Python 3) with the parts that post it to my Twitter removed. Feel free to adapt it to your purposes and report back here. Have fun.


import pandas as pd
import datetime
# Get data
url = 'https://projects.fivethirtyeight.com/trump-approval-data/approval_polllist.csv'
df_raw = pd.read_csv(filepath_or_buffer = url, parse_dates = ['startdate', 'enddate'])

# Drop columns we don't need
df = df_raw.loc[:,['pollster', 'startdate', 'enddate', 'approve', 'disapprove']]

# Add net job approval column
df['net'] = df['approve'] - df['disapprove']

# Add average date column
df['avgdate'] = df['startdate'] + (df['enddate'] - df['startdate']) / 2

# Restrict to only columns of interest
df = df.loc[:,['avgdate', 'pollster', 'net']]

# Split into Rasmussen, non-Rasmussen
df_ras = df[df['pollster'] == "Rasmussen Reports/Pulse Opinion Research"]
df_nonras = df[df['pollster'] != "Rasmussen Reports/Pulse Opinion Research"]

# Helper function. Compute average of 'net' column for rows within delta days (+-) of thedate
def rollingavg(thedate, thedf):
    days = 7
    delta = datetime.timedelta(days=days)
    temp = thedf[(thedf['avgdate'] >= thedate - delta) & (thedf['avgdate'] <= thedate + delta)]
    return temp['net'].mean()
  
# Create final dataframe
min_date = df_raw['startdate'].min()
max_date = df_raw['enddate'].max()
df_final = pd.DataFrame(pd.date_range(min_date, max_date), columns = ['Date'])
df_final['Rasmussen'] = df_final['Date'].apply(lambda x: rollingavg(x, df_ras))
df_final['nonRasmussen'] = df_final['Date'].apply(lambda x: rollingavg(x, df_nonras))
df_final['Difference'] = df_final['Rasmussen'] - df_final['nonRasmussen']
df_final = df_final.set_index('Date')

# Split the plotted window into a solid head and a dotted 8-day tail
# (they share one day so the two line segments connect in the plot)

df_solid = df_final.tail(90).head(83)
df_dots = df_final.tail(8)
df_dots = df_dots.rename(columns={"Rasmussen": "Rasmussenx", 
                                  "nonRasmussen": "nonRasmussenx", 
                                  "Difference": "Differencex"})
df_combined = pd.concat([df_solid, df_dots], axis=1, join='outer')

# Make plot
aplot = df_combined.plot(title = 'Trump Approval (last 90 days)',
                         color = ['r','b','g']*2,
                         style = ['-']*3 + [':']*3)
aplot.legend(['Rasmussen', 'non-Rasmussen', 'Difference'], frameon=False)
aplot.tick_params(labelbottom=True, labeltop=False, labelleft=True, labelright=True,
                  bottom=True, top=False, left=True, right=True)
fig = aplot.get_figure()
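
For what it’s worth, the equal-weight ±7-day window in rollingavg is the simplest possible smoother. A time-decayed variant would weight each poll by exp(-age / tau) so older polls fade gradually instead of dropping off a cliff at day 8. The tau of 5 days and the toy data below are my assumptions for illustration, not 538’s scheme:

```python
import math
import pandas as pd

# Time-decayed alternative to the equal-weight rolling average: each
# poll is weighted by exp(-age / tau_days), so older polls count less.
def decayed_avg(thedate, thedf, tau_days=5.0):
    ages = (thedf['avgdate'] - thedate).abs().dt.days
    weights = (-ages / tau_days).apply(math.exp)
    return (thedf['net'] * weights).sum() / weights.sum()

# Toy data: one stale poll at net -10 and one fresh poll at net -2.
toy = pd.DataFrame({
    'avgdate': [pd.Timestamp('2019-08-01'), pd.Timestamp('2019-08-29')],
    'net': [-10.0, -2.0],
})
today = pd.Timestamp('2019-08-30')
print(round(decayed_avg(today, toy), 2))  # close to -2; a plain mean gives -6
```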

You started a bot to record how far off Rasmussen is from the average. Maybe it’s not your bogeyman but that is an odd amount of attention to pay to a particular pollster.

And no, I’m not going to set up a database to test this. The last 40 days have had a meaningless blip of change; I don’t need to dig for answers to a meaningless blip.

Rasmussen seems to get mentioned an odd amount by a particular president, so from time to time I would check how they were doing compared to the field. I knew they had a Republican lean, but I didn’t know if that lean was consistent. One of the times I was looking into this was around the time I was looking for a project to use as a proof of concept for some sort of data visualization connecting a data source to my Twitter account using the Twitter API. I chose the 538 approval database as the data source and used it to visualize a question that happened to be on my mind that day. I spent about two hours writing this bot back in January and look at its output once in a while when Rasmussen is trending for some reason or other.

No one asked you to set up a database. I handed you the code. You seem to be paying an odd amount of attention to something you find meaningless. Other people find this topic interesting. Does that bother you for some reason?

I do find the topic interesting. I just think you are talking nonsense about the topic.

That’s unfortunate.

You could have taken this opportunity to learn something about data modeling, using smoothing functions to reduce the effects of noise, time decay of data points, and even some pandas dataframe basics.

You chose to go a different way.

No, I learned some stuff. Thanks, DSeid!

Two days ago DSeid believed that no rational analyst would time decay data.

Carnal K you are welcome. It really is not worth engaging with the silliness that Lance Turbo is posting on this. We can use our time more productively … say by engaging with the guy muttering on the street corner.

You mad bro.

You can admit it. You didn’t consider time decaying data before, but you included it in your last analysis. You learned something. You’re welcome.

Your failure to include time decay of data led directly to several erroneous conclusions earlier in this thread. Now that won’t happen again. This new knowledge is a difference maker.

(I can’t help myself.)

[Deleted to not get warning] … in a hypothetical universe in which there were only two pollsters, one A+ reporting monthly and one C+ reporting daily, the C+ one should be pretty much ignored, and the reality recognized that we only get information of value once a month. A recent C+ result is no less shit for being a day old.

I’ll illustrate. I am VERY interested in polling about Iowa. But the relatively recent poll by C+ rated Change Research that has Warren way up by 11 there does not inform. The older A+ rated Monmouth Biden +11 and even the much older Selzer Biden +8 are still the placeholders for where that race is at despite the fact that the Change Research result is far more recent.

Recent shit still stinks as bad. I’d love more recent quality polling, and my confidence that Selzer and Monmouth accurately capture where things stand today is decreased. But a C+ poll is still worth near zero, so until a new Selzer or another highly rated house posts, all I can say is we don’t know if it has changed or not. The C+ house’s result hardly impacts that assessment at all.
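
To put rough numbers on that claim: with hypothetical weights (the grades and margins are from the polls above, but the weights and the encoding of Warren +11 as a negative Biden margin are my assumptions for illustration), a near-zero-weight C+ poll barely moves the combined estimate.

```python
# Hypothetical weights: a fresh low-weight C+ poll vs. older highly
# rated polls. Margins are Biden's lead; Warren +11 becomes Biden -11.
polls = [
    ('Monmouth (A+), Biden +11',         11.0, 1.0),
    ('Selzer (A+), Biden +8',             8.0, 0.8),  # older, partly decayed
    ('Change Research (C+), Warren +11', -11.0, 0.1), # low grade, tiny weight
]
est = sum(margin * w for _, margin, w in polls) / sum(w for _, _, w in polls)
print(round(est, 1))  # ~8.6: still solidly Biden, barely nudged by the C+ poll
```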

I know you don’t get that but oh well.

But that’s not how Nate Silver does it.

Here is Nate’s Iowa forecast for Nov 2016.

The highest weighted poll is indeed A+ Selzer & Company with a huge 3.84 which was in the field Nov 1-4. However, the second highest weighted poll is C+ RABA Research at 2.43 in the field Nov 1-2 followed closely by C- Survey Monkey at 2.41 in the field Nov 1-7.

They are well ahead of:
A- Quinnipiac University 2.01 Oct 20-26
A- Ipsos 0.68 Oct 17-Nov 6
A+ Selzer & Company 0.57 Oct 3-6
A+ Monmouth University 0.25 Sep 12-14

In particular note the decay of the Iowa gold standard Selzer poll that is a month old. Fresh C+ data has four times the weight of month-old A+ data.
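
Assuming (simplistically) that the entire gap between the two Selzer weights listed above is pure exponential time decay, you can back out an implied half-life. In reality 538’s weights also fold in sample size and other factors, so treat this as a rough illustration only:

```python
import math

# Implied half-life from the two Selzer weights, assuming the whole
# difference is exponential time decay (an oversimplification).
w_fresh, w_old = 3.84, 0.57   # Nov 1-4 vs. Oct 3-6 Selzer weights
age_gap_days = 30             # approximate gap between the two polls
rate = math.log(w_fresh / w_old) / age_gap_days
half_life = math.log(2) / rate
print(round(half_life, 1))    # about 11 days under these assumptions
```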