Can all information be found online?

A hash still doesn’t add any meaningful security for a four-digit PIN, because it’s trivial to just try all 10,000 PINs until you find the one that matches the hash. Hashes are only relevant when you’re hashing passwords, which (due to greater length and larger alphabet) can be much harder to search exhaustively. Well, at least if the password was chosen well, which most of them aren’t, but security is at least possible.
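To put a number on "trivial", here is a minimal sketch of that exhaustive search. It assumes the PIN was hashed with plain unsalted SHA-256, which is my assumption and not something stated in this thread; `hash_pin` and `crack_pin` are made-up names for illustration.

```python
import hashlib
from typing import Optional


def hash_pin(pin: str) -> str:
    """Return the hex SHA-256 digest of a PIN string (assumed scheme)."""
    return hashlib.sha256(pin.encode("ascii")).hexdigest()


def crack_pin(target_hash: str) -> Optional[str]:
    """Try every four-digit PIN from 0000 to 9999 until one matches."""
    for n in range(10_000):
        candidate = f"{n:04d}"  # zero-padded, so 7 becomes "0007"
        if hash_pin(candidate) == target_hash:
            return candidate
    return None  # the hash did not come from a four-digit PIN


stored = hash_pin("4821")   # what a leaked database might hold
print(crack_pin(stored))    # prints 4821
```

A salt or a deliberately slow hash would only multiply the work per guess by a constant; with just 10,000 candidates the search space is still tiny.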

And even if data is digitized, it might still never make it online. When I was an undergrad, the astronomy department had huge racks of paper tape holding old data that had never been transferred to a new medium, and the machines built to read it had all been scrapped. Every so often, a student would start some effort to get the tapes read by improvising a scanner or the like, but (being undergrads) never got the chance to finish the job. Eventually, all those racks of data just got thrown out. This sort of thing doesn’t happen any more, now that people realize that keeping media updated is something that needs to be done, but it was pretty common back in the day.

For several years my grandfather worked as an ice man. He worked at an ice manufacturing plant in San Pedro CA, which delivered blocks of ice to residents. When my mother died I inherited a couple of regional industry newsletters. I enjoyed reading the articles talking earnest trash about the perils and inefficiencies of refrigeration. I posted scans on a blog. I’ve never been able to google a link to them directly. I can google to my blog, because I know its name, and then search there, but googling directly to the newsletters is effectively impossible.

It doesn’t help that the name of the newsletter is “Ice Picks”.

I have all of them! :)

I was a librarian for 30 years (retired now). From the mid-2000s on, I had lots of conversations that went like this:

Person: “What’s it like being a librarian now that Google* has digitized everything?”
Me: “Google hasn’t digitized everything.”
Person: “Yes, they have. I read it somewhere. Have you started looking for a new career yet?”

*Sometimes it was the Library of Congress instead of Google.

Oh, and don’t get me started on the people who claim to have a USB drive containing “every song ever written.”

I was talking with an old friend of mine from Texas last night about Texas music. He asked me if I remembered the song “Blah Blah Blah”. I’d heard it before, so this morning I tried googling it. I found this thread from 2010: http://boards.straightdope.com/sdmb/showthread.php?t=588560. I tried to find a way to buy it online but didn’t have any more luck than the OP. So the internet doesn’t have that…

ETA: It’s this one: Blah, Blah, Blah Chords - Sam Oaks - Cowboy Lyrics

Other searcher: Yahoo

Google has certainly made it a goal of theirs to digitize everything and get it all online. And they’ve made quite impressive progress. But they’re still not remotely close to done, and won’t be for a very long time, even if you only count public and semi-public information like books and newsletters. And there will always be some material, like most love letters, which will never be digitized.

They may have made it their goal to digitize everything, but they certainly have not made it their goal to make it all available online. I frequently run into situations where they certainly have the book digitized, but only offer it in a “snippet” view, or when, after going through a few pages, I find a block that says that the rest isn’t available.

I’m not just talking about recent things still under copyright. I am frequently frustrated by old, long out-of-copyright books that Google Books withholds from me, although my search engine (usually Google itself) tantalizingly tells me that they have something with one of my keywords. It makes me want to strangle them.

You should put them on archive.org.

Is the OP kidding? All information is not even close to being posted online. Puh-leese. Put the laptop down and get with the real world.

Seconded.

And if you are a librarian, curator, or otherwise in charge of a valuable collection, and a corporation like Google comes to you with a generous offer to come in, ransack your stacks and digitise everything for free, please make triple-sure, in writing, that the license terms require them to make all derived materials openly available to you, your patrons, and the general public in perpetuity.

Not everything is online. Some of the research papers that I’ve published are online, but others are not. This seems to be more a factor of where and when they were published rather than the subject material.

Some professional literature in the sciences has been digitized and some has not.

Lots of stuff is kept offline because it is secret or because nobody has ever bothered to digitize it. But even in this day and age, there are things that are not hopelessly obscure that seem to have no internet presence.

Ike was the president of Columbia University after the war. I have long maintained there are no images of the interior of the president’s residence on the internet. (I am really looking for the interior during Ike’s tenure.) No reason it is not online. It just isn’t.

There is no central database to search firearm serial numbers or ownership. Many of the NICS forms are available online to the BATF in PDF form, but they are not digitally searchable.

Just to add a pet peeve. I’m something of an Apollo missions tragic. I love to read the old reports and papers surrounding the programme. NASA published a huge volume of work, detailing pretty much everything you want to know. Most of it only resides on paper. There are scans of some documents available. NASA itself has a web page that provides links to many. And many of these links are to scans made by private individuals who took the time to go to the libraries where the now aged documents reside. Many of these scans are pretty mediocre, often with essentially impossible-to-read figures and barely discernible pictures.
Not just these documents, but the blueprints to most of the systems reside on microfiche. Again essentially impossible to access. With the 50th anniversary looming there is going to be another lift in interest in the moon landings. It would be brilliant to see a lot more of the core information made available, but there is also no doubt, it isn’t a small undertaking to take a small library’s worth of publications and to curate a digital on-line version.

Rather than Google, maybe someone could convince Elon or Jeff to fund it.

But this is just my little obsession. The world is filled with such repositories. What is important is that much of the information in these repos is high quality. It isn’t the masses of barely curated detritus of modern life.

archive.org will digitise microfilm/microfiche as well as books. I am sure they would be happy if some billionaire dropped extra money on them.

Don’t blame Google for all that information not being available to the public; blame the Authors Guild for throwing a hissy fit to stop them.

My apologies to Google if they tried to do a good thing but got their arm twisted through no fault of their own. Is there anything they could have done in advance to avoid such problems, or anything they can do now to resolve them?

Thanks. I’ll check them out. If they’re interested in PDFs of just two newsletters, that would be cool.

No interest or permission needed, you can just post it.

If you read the articles on Wired’s website about this topic, it comes down to a federal judge blocking the settlement agreement between the Authors Guild and Google. I do hope this problem can be solved, because it would enable much better access to the information.

I wonder if Google can use all their scanned books (a significant fraction of every book ever published in English) to train an artificial intelligence.

It would not be illegal for the AI to answer questions based on those books, in the same way it is not illegal for me to read a book and then tell you any facts I know from it and make short quotes from the book.

Of course, Google was going to take it a step further. You would have been able to purchase, for a minimal cost, access to the full text of any book ever published from your computer.