How do Facebook, Google, etc., take revenue from publishers?

In the March issue of Harper’s Magazine, the Harper’s Index feature, which uses various numbers to report trends, says that Google and Facebook are responsible for billions in losses to publishers because they are using publishers’ content.

Minimum estimated annual revenue U.S. news publishers miss out on as a result of Facebook’s use of their content

$1,900,000,000

As a result of Google’s use

$10,000,000,000

Source: Anya Schiffrin, Columbia University (NYC)

IME those sites are just providing links to content, which you then must follow to the source site. They are not republishing the content, except in some cases where Google will summarize a snippet when it thinks you are searching specifically for that information, so you get your answer without having to click a link. But that hardly seems a substitute for a paid subscription to the source.

What am I missing?

I’ll venture a couple of assumptions:

A) Some websites are very popular.
B) Some people working in advertising, UX, web design, etc. are quite good at what they do.

I don’t know if that explains much, but I’m thinking it may be some combination of effective enough advertising and enough desire from advertisers to pay for “clicks,” time spent on the website, or whatever space they’re ultimately paying to be seen in, along with the obvious sheer popularity of Google and Facebook.

Now, I think that with this much weight, there may be something that’s getting away from you. In my experience hanging out online, the owners of even smaller entities, and for that matter the staff of places like bulletin boards, static forums, and what have you, have followed suit.

I think perhaps with commercialization on the level of something like Facebook and Google, leaving aside smaller websites that can have some kind of nuance, so to speak (think 4chan, Wikipedia), there may be something sophisticated going on that not everyone can see, to put things simply. Whether there’s corruption at play I don’t actually know.

The potential flip side to my suspicion of corruption, or what have you, brings to mind a scene from Ghost in the Shell, where Motoko’s childhood friend Kuze was using some kind of advanced hacking technique to collect revenue that had fallen by the wayside and trickled down from the operation of some huge organization. He simply collected it, without negative or positive consequence, making him insanely rich.

I’m not at all clear on what you are trying to say here.

The article claims that Facebook and Google are using publishers’ content. They are not, as far as I can see. They are providing links to that content.

I am not asking why people are not spending more time on publishers’ web sites.

They are certainly using content. Lots of stuff is completely readable on both Facebook and Google.

I think it’s extremely unlikely the publishers would have gotten all those eyeballs without Facebook and Google, but i understand where they are coming from.

I haven’t seen any content republished on Facebook unless it was posted by the publisher themselves. What kind of content are you seeing?

That’s it. Those are the cases when Google is republishing the information without permission. At the very least, the publishers want the user to click a link in all cases.

An aggregator will publish a headline and often a few sentences and an image. Many people are satisfied reading the headline and moving on. In the case of news, there often isn’t much new content beyond the first paragraph. In the case of humor, like The Onion, most of the value is in the headline.

The case for articles with actual content is less obvious. If you’re scanning titles for an article to read on an aggregator, then you are seeing the aggregator’s ads. If you’re doing it on the publisher’s website, then you are seeing their ads.

I think it boils down to the notion that content creators want more control of how their stuff is made available. It’s the age-old tension of publicity. The difference now is that it’s monetized and there are metrics and logs that can be presented in court. Everyone is affiliated, tracking, referred, served, etc.

For better or for worse, the FANGs and them short circuit some of that control.

I’ve seen aggregator sites, but I don’t see them in my Facebook feed or Google searches. I’m wondering how they get eyeballs.

Australia and Canada (and perhaps other countries) have recent legislation requiring aggregators like Facebook to pay something resembling fair value for being able to populate their sites with news agencies’ hard work. The argument is that they fill column inches with content paid for by news organisations without contributing to the process of journalism. A figure I heard yesterday was that the Australian Broadcasting Corporation’s [state-owned, like the BBC] payment share from last year was enough to keep several dozen journalists on board, with something similar for the newspapers and news organisations of record. And Australia is a small news market, so the Harper’s figures may be credible.

The broader issue is that Facebook is not just a news stand where you come to browse the papers and choose which to buy. More people rely on social media to provide this browsing buffet of headlines that becomes their exposure to news and current events, and brings in users and supports Facebook’s advertising. If the quality journalism that underpins that slowly goes broke and dies, then Facebook will happily feed garbage journalism, conspiracy theories and celebrity shit.

Governments are struggling with how to sustain journalism as one of the pillars of democracy with the slow death of print media. Pulling money back from Facebook and Google in this way to support it is clunky, but may be necessary to keep its pulse going until a better model comes along.

That used to be true, back in the day when news articles had headlines. But it’s been years since I’ve seen one of those online. Modern articles don’t have headlines; they have hooks. A headline is a brief summary of an article, something like “Serial killer found guilty on all counts”. A hook is the opposite, carefully crafted to give as little information as possible, like “Verdict in serial killer case: Here’s what you need to know”.

In many/most cases, it is the publisher itself that’s providing the summary. So it’s rich when they complain about Google showing information that the publisher went out of their way to provide.

For example, here’s a portion of the metadata provided at the top of an NYT article:
<meta data-rh="true" name="robots" content="noarchive, max-image-preview:large"/>
<meta data-rh="true" name="description" content="President Vladimir V. Putin vowed to punish those responsible for the assault, one of the deadliest in Russia in decades. U.S. officials have attributed the attack to ISIS-K, a branch of the Islamic State that has been active in Iran and Afghanistan."/>
<meta data-rh="true" property="og:url" content="https://www.nytimes.com/live/2024/03/23/world/moscow-shooting"/>
<meta data-rh="true" property="og:type" content="article"/>
<meta data-rh="true" property="og:title" content="Death Toll Rises to 133 in Moscow Concert Hall Attack"/>

The entire purpose of this metadata is for Google, Facebook, etc. to use alongside the news links. It’s completely invisible otherwise. The publishers provide it because they want their content to show up on search engines and social media sites. But they also want it both ways: they want the benefit of being linked to, and they want to be paid for the summary info that they are already providing.

Maybe in some cases there’s some unauthorized scraping going on, but that’s not the case for the big publishers. If they want Facebook, etc. to not show a summary, they can choose to not provide the metainformation.
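To make that concrete, here’s a minimal sketch of how a site could turn those tags into a link preview. This is only an illustration using Python’s standard library, not how Facebook or Google actually do it; the names `MetaTagParser` and `build_preview` are made up for the example.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta> tags as {name-or-property: content} pairs.

    Hypothetical helper for illustration; real crawlers are far more involved.
    """

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <meta .../> are also routed through here.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        key = attr_map.get("name") or attr_map.get("property")
        content = attr_map.get("content")
        if key and content is not None:
            self.meta[key] = content


def build_preview(html):
    """Return the pieces a link card typically shows: title, URL, summary."""
    parser = MetaTagParser()
    parser.feed(html)
    meta = parser.meta
    return {
        "title": meta.get("og:title"),
        "url": meta.get("og:url"),
        "summary": meta.get("description"),
    }


if __name__ == "__main__":
    # Tags copied from the NYT example above (description omitted for brevity).
    sample = (
        '<meta data-rh="true" property="og:url" '
        'content="https://www.nytimes.com/live/2024/03/23/world/moscow-shooting"/>'
        '<meta data-rh="true" property="og:title" '
        'content="Death Toll Rises to 133 in Moscow Concert Hall Attack"/>'
    )
    print(build_preview(sample))
```

Everything the preview shows comes from tags the publisher wrote; if a tag is missing, the corresponding field is simply empty (`None` in this sketch).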

By aggregator, I meant Facebook. Don’t people repost and share news and articles in their feeds? Aren’t most posts just article links or memes with an optional comment?

The problem the publishers complain about, as I understand it, is that the main sites (Google, Facebook, etc.) manage to capture the entirety of a member’s interests and serve them targeted ads before they even click through to the publisher site (the newspaper or magazine website). And as mentioned, often the link summary (or commentary provided by other Facebook users) is sufficient that users simply read that “headline” and don’t go to the site.

This is why Facebook, for example, was (is?) blocking links to Canadian publishers, even if those links are (typically) posts by ordinary Facebook users. They’ve figured out, far better than the publishers have, how to monetize their users, so they get all the ad revenue. Meanwhile, the typical Canadian newspaper is paywalled, thus guaranteeing I won’t stay. So if I go to that site at all, I only peruse the front-page headlines and short summaries.

There are websites - you’ll know them when you see them - that have obvious hooks. “The episode of Beverly Hillbillies that was never shown” or “Doctors hate this one trick” or “Top 10 bloopers on the news”. The top 10 bloopers are one per page, nothing unusually spectacular, and each page is nothing but assorted ads and a small amount of content, and you need to click to the next page for the next item… each click is an ad viewed and revenue for the website.