Idea for website—please shoot down?

Roger_That · July 12, 2021, 4:34pm

I’ll often come across sites, usually on Facebook, that I’ve learned are terminally buggy, ad-ridden, clickbait, wastes of my time, etc. to open, typically labelled “promoted” or something of the sort, and I was wondering why someone couldn’t extract the content of these sites and render them in the usable form that people want them to be in. (Of course this person would need to be much better than I am at navigating websites, copying material, and above all at preventing malware and such from infecting his computer.) If this is possible (I’m not sure it’s as easy as I think), is it legal? If so, it would be great to open these buggy wastes of time, and peruse a bunch of these websites safely. Can you tell me why this is an unfeasible idea? Thanks. I’m thinking it would be a popular website and a service to those of us who know better than to open up one of them.

GreenWyvern · July 12, 2021, 5:04pm

There are adblockers for browsers that already block ads, pop-ups, and other content you don’t want to see. Try uBlock Origin.

There are also a number of browser add-ons specifically for Facebook, Twitter, etc. that will allow you to customize the content.

Telemark · July 12, 2021, 5:17pm

Why do you want to open these sites again? They’re full of spam, threats, or crap.

AlsoNamedBort · July 12, 2021, 5:22pm

“Clickbait” by definition has no useful content. Why would I want to extract worthless content? I’d rather block it.

There’s a subreddit dedicated to terrible clickbait called Saved You A Click that basically does the same thing. Spoiler alert: It’s all garbage.

Stranger_On_A_Train · July 12, 2021, 5:54pm

Well, you can start here for the extraction part:

The legal issues are fraught. Facebook and other social media sites make the information on posts or pages public, but it is still protected by the End User Agreement (as is their agglomeration and sifting of data for personal information to be sold to third parties). If you are surreptitiously scraping data and republishing it, you are almost certainly violating copyrights, even if that is not explicitly stated.

As for “rendering them in a usable form”, that is trivial, requiring no coding, scraping, sorting, or filtering: You’re welcome,

Stranger

naita · July 12, 2021, 8:14pm

I’m not sure what your idea even is. For most “real content” you can find on bad websites, there are already good websites with the same content. Do you have any examples of what you would like to see available on a better site?

The kind of content that you can exclusively find on terrible websites is usually of dubious legality and/or something people aren’t willing to pay for. A lot of the brokenness of those pages is about trying to make a buck out of the visitors.

ZipperJJ · July 12, 2021, 8:44pm

Aside from what everyone has already mentioned (namely that the content is garbage), the content is not usually the original source, either. I don’t generally click through to click-bait type links, but I do occasionally click on stories that are posted by random US-based local news stations. Most of the time, there are a lot of pop-ups and auto-play videos, and it turns out the source is something like the AP or NYT or another city’s local news.

I’ve used professional scraping services before, for a legit reason. It was not cheap, nor was it fast and easy. I would hate to spend my scraping credits on shit stories that are re-writes of free and legit sources.

Also once the sites figure out you’re scraping…they’ll block your scraper service.

Roger_That · July 12, 2021, 9:05pm

Usually gossipy celebrity garbage that I get interested in seeing (“Which movie stars own gigantic ranches in Wyoming?” or “Famous singers who have never been married,” that sort of crap) that I might click on, but I see it’s “promoted” and I’ve been there before and wanted those five minutes back. The content might be summed up in a list, or a few paragraphs at most, but I know those sites are full of crap, in every sense of the term, so I don’t click. Still, I think sometimes that an aggregator of these sorts of sites, with cleaned-up delivery, might be successful. Not for me: I don’t have the skills or the interests, but I wondered why someone with those skills hadn’t tried it.

Elmer_J.Fudd · July 12, 2021, 9:12pm

I get it. Sometimes I do want to see “25 images of bra-less TV actresses from the '70s”, but I don’t want to click through 25 pages with a single image each and 25 pages of just ads to see them all.

Roger_That · July 12, 2021, 9:19pm

Exactly. I was wondering if there were legal or technical obstacles to someone aggregating such sites.

Jackmannii · July 12, 2021, 9:26pm

If you really want to know why you’re supposed to cover your side view mirrors with plastic bags or pour salt down the drain and don’t want to scroll through a ton of clickbait, try posing the question on Google. Many times a site like Snopes will provide the answer.

iamthewalrus_3 · July 12, 2021, 9:35pm

Yes. Copyright law is the primary legal obstacle.

digs · July 12, 2021, 11:00pm

This. But often, even if you find the info on a “good” website, it turns out to not be worth the time and energy.

I’ve occasionally been tempted to click on a clickbait “story”, so instead I googled it in hopes of finding a non-slideshow/non-ad barrage/non-trashy version of it.

But then I realize that I don’t need to know about one weird trick, or celebrities’ private lives… Hey, I’ve got better things to do, and it’s just because I’m lonely and bored and tired that I’m even clicking on drivel. (And it’s those times that I lose the ability to discern drivel from good stuff).

Just say no to drivel.

not_what_you_d_expect · July 13, 2021, 12:20am

I have to admit, I’m really curious about the “Once famous star now works in Placerville” one. I mean, I’m only a little way from there, I could go say hi!

But I know better than to click on that stuff.

CookingWithGas · July 13, 2021, 12:35am

You wouldn’t have to copy it, just hot-link to the heart of the content. That’s sort of a gray area for IP rights, IANAL.

Roger_That · July 13, 2021, 12:36am

Yeah, me too. Once in a while. On a slow day. But I don’t.

Is it really illegal to start a website that purports only to cite clickbait sites as sources (“According to getchermalwareheah!.com, Tom Cruise and Penelope Cruz own Carnival Cruiselines”)?

Duckster · July 13, 2021, 1:05am

The SDMB used to regularly be cloned.

blue_infinity · July 13, 2021, 2:32am

This website removes clutter for news sites.

chappachula · July 13, 2021, 4:43am

There exists a site called www.outline.com which does something similar to what the OP suggests.

You cut-and-paste a webpage address, and it shows you a text-only version of that page.
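For what it's worth, the "text-only version" part is not much code on its own. Here is a minimal sketch in Python, assuming you already have the page's HTML as a string (fetching it, and whether you are allowed to republish it, are separate problems). `TextOnly` and `text_only` are names made up for illustration, and this is only a guess at the general idea, not how outline.com actually works:

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping blocks that usually hold ads or scripts."""

    # Tags whose contents are dropped entirely (an assumption; tune to taste).
    SKIP = {"script", "style", "noscript", "iframe"}

    def __init__(self):
        super().__init__()
        self.parts = []       # text fragments kept so far
        self.skip_depth = 0   # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.parts.append(text)


def text_only(html: str) -> str:
    """Return the page's visible text, one fragment per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


page = (
    "<html><head><style>p{color:red}</style>"
    "<script>alert('ad')</script></head>"
    "<body><h1>Title</h1><p>Body text</p></body></html>"
)
print(text_only(page))
```

A real clickbait page would need much more than this (pagination, lazy-loaded images, anti-scraping measures), which is probably part of why nobody bothers.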

Ponderoid · July 13, 2021, 4:55am

Try https://deslide.clusterfake.net/ for getting all images together on one page from a slideshow-like presentation.
