Can someone explain "cloud computing" to me?

When the buzzword-term “cloud computing” (and its siblings such as “in the cloud”) first came onto the scene, I was under the impression that it was about handing off intensive computation tasks to a large array of computers and having them gang up on the project, instead of subjecting your local computer to the load. Kind of like connecting to a mainframe back in the old days.

You know, like getting the average temperature each minute for each of 10,000 reporting stations for a year’s worth of data, computing each station’s difference from the average, computing the variance, mapping the hot and cool spots, and constructing a motion algorithm to explain how the hot and cool spots move over time.

But as actually used, it seems to be no more than online storage of documents and (occasionally) the use of a hosted version of some common applications (Word and Excel, for instance) that you can use in lieu of creating and maintaining files with your own local programs (e.g., Google Docs).

Is that an accurate impression? Is there a lot of “cloud computing” going on that is other than “store your files here instead of on your own hard drive” + Google Docs?

It refers to all those things. Basically, everything that used to be called “software-as-a-service” is now called “cloud computing.”

Next up: nebula networking[sup]1[/sup].

[sub]1. I just invented that so you have to pay me if you use it.[/sub]

It’s basically a flexible infrastructure setup that allows you to quickly and easily set up and use computing resources based on standard configurations.

Instead of having your own set of servers and needing to add computing power to match demand, you can just pay for additional, incremental capacity that is managed by a third party.

It is largely a marketing buzzword, but there are a few underlying technologies:

[ul]
[li]Virtualization – a big server can emulate several little servers, so you can spawn servers on demand as you need them instead of buying a lot of hardware, or conversely take an image of a program set up on a small server and port it seamlessly to a big server so it runs faster.[/li]
[li]“Big Data”/NoSQL – systems that store very large datasets in a way that permits more efficient access than traditional databases can manage at that size.[/li]
[li]MapReduce – a programming model for processing very large datasets in parallel.[/li]
[/ul]
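The MapReduce model in that last bullet is simple enough to sketch in pure Python. This toy single-machine version averages temperatures per reporting station, echoing the weather example in the original post; real frameworks such as Hadoop run the same map/group/reduce steps across thousands of machines. The station codes and readings here are made up:

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Toy MapReduce: map each record to (key, value) pairs,
    group the values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical readings: (station, temperature in Celsius)
readings = [("KSFO", 14.0), ("KJFK", 9.0), ("KSFO", 16.0), ("KJFK", 11.0)]

averages = map_reduce(
    readings,
    mapper=lambda rec: [(rec[0], rec[1])],            # emit (station, temp)
    reducer=lambda station, temps: sum(temps) / len(temps),
)
# averages == {"KSFO": 15.0, "KJFK": 10.0}
```

Because the mapper and reducer only see one record or one key’s group at a time, the framework is free to scatter them across as many machines as it likes.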

It’s not just “store stuff online”, even if that is the most obvious entry point.

Instead of losing your hosted data from the failure of one server, you can instead lose it from distributed servers all over the world.

I agree with all the above but the key thing is “the stuff is up in a ‘cloud’ somewhere, which means you as a user don’t care about how it’s physically implemented.”

You can see what the U.S. government thinks it is here.

There’s definitely some theoretical value to the “cloud” concept. Quasi-centralized data storage means you can get access to it from any computer anywhere in the world – just like web-based e-mail vs. local e-mail. Reputable major cloud service providers would, presumably, offer near-100% uptime guarantees, with redundant servers, redundant data storage, and regular back-ups done automatically for you.

It also means the service providers have total control of the fate of your data. Did your monthly payment get lost in the mail or something? Do you have some dispute with the company? Are you absolutely confident of their security measures to protect your data from hacking and theft? What if your provider goes out of business? Are they going to scan your data to glean all the info they can get about you (or your company), to add to their massive data profiles on you, the world, and everything (like Google does with your e-mail)?

At its most cynical, “All your data and biz plans are belong to us!” – The big Mi$creant scandal of 2001, in which an over-zealous Mi$creant lawyer wrote terms of service that seemed to give Mi$creant total ownership of ANY data that ever passed to, from, or through any of their servers. This led to a major world-wide shit-storm, in which numerous ISP operators black-listed all known M$ sites for a while.

Cloud is definitely a big buzzword in the IT industry today. It can refer both to computing (virtual servers, computers, and other infrastructure) and to cloud storage. There are potentially some huge benefits to a lot of it, but also some real drawbacks.

One of the biggest benefits is that, properly implemented, it can take advantage of economies of scale. If I need a workstation, I might also need, say, 100–200 GB of disk storage, 4 GB of RAM, a dual processor, etc. However, at any given time I’m probably only using a fraction of that. Thus, by combining all of those resources into a giant pool, the provider can reduce what’s needed on a per-user basis. Further, there are concepts such as deduplication which can improve data density even further.
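Deduplication can be sketched as a content-addressed block store: each block is keyed by its hash, so identical blocks are physically stored only once no matter how many files (or users) contain them. This is a toy Python sketch; the tiny block size is for illustration only, and real systems use much larger blocks plus reference counting and collision handling:

```python
import hashlib

class DedupStore:
    """Toy content-addressed block store: identical blocks are
    stored once, keyed by their SHA-256 digest."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes (one physical copy each)

    def put(self, data: bytes, block_size: int = 4) -> list:
        """Split data into blocks, store each unique block once,
        and return the 'recipe' of digests needed to rebuild it."""
        recipe = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks[digest] = block  # re-storing a duplicate is free
            recipe.append(digest)
        return recipe

    def get(self, recipe: list) -> bytes:
        """Reassemble the original data from its recipe."""
        return b"".join(self.blocks[d] for d in recipe)

store = DedupStore()
recipe = store.put(b"ABCDABCDABCD")  # three identical 4-byte blocks
# store.get(recipe) == b"ABCDABCDABCD", yet len(store.blocks) == 1
```

The twelve bytes of input occupy only four bytes of block storage plus the recipe, which is exactly the density win the post describes.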

High density and centralized locations also allow for other scale benefits that are impractical otherwise. For instance, it’s expensive, both in money and resources, to back up my computer at home, but in a cloud environment backups can have dedicated and specialized resources, so they can run with little or no impact on critical system performance. You also get redundancy: a burned-out motherboard on my home computer means I’m out of commission until I get it replaced, but as a user in the cloud I wouldn’t even notice.

Another huge benefit is that the computing and data is accessible remotely. Potentially, I could save data at work and access it from home securely. Or I could start an intensive computation and monitor it from my phone.
As others have said, though, there are huge drawbacks. Whoever is doing it NEEDS to be competent, and you pretty much have to trust them absolutely with your data and availability. When I’m using a VM, if I lose network connectivity I’m completely hosed, whereas with a physical machine I can at least work offline. What happens if they get hacked? What assurances can they give me on my data? I’m unlikely to be targeted for an attack as a random home PC user, but data centers are obvious targets because they could potentially yield data from thousands of users simultaneously.

Well, it’s a lot more than a marketing buzzword, because a lot of technology goes into providing the flexibility.

The idea is, we have a set of physical resources, which we can apply almost willy-nilly to a set of customer requests. The customer requests can be for any of the following:

data storage
a virtual computer to call their own
processing

The first and last you already referred to. The second is that you’re given “your own computer” to use, and it seems to you like you own the workstation; you can do stuff as the root user, you can manage it, you can run apps on it, etc. But that computer doesn’t actually exist; it’s really a “job” that any available server can do the next work unit on.

As a data communications guy, the part that really blows me away is that you can even have a virtual communications network. That is, you can connect a bunch of virtual servers up with virtual network connections, and manage those virtual network connections to some extent.

Of course, the data communications is actually happening on real routers. But it’s all dressed up (with a pretty significant bit of technical wizardry) to look like the virtual picture of the network.

The next step, of course, is to do all this without hardware. After all, if your software is good enough, it shouldn’t need hardware.* :wink:

* As it turns out, this is true only for analytical problems!

We’ve been doing this for years. Any big job I want to run I can submit to our internal cloud, made up of thousands of processors - it runs on some machine it chooses (though you can ask for specific configs) and gives you the answer. If it goes down for some reason (unlikely) it gets resubmitted to another machine and you never know.
For normal work I get assigned a server, a different one each time, and my session picks off from where I left it. If I travel from California to India I can pick it up as if I were at my desk, and my laptop has an emulator which lets me pick it up from home.

As for the negatives, I agree, but when I was using a normal PC at my last job I made so much use of shared drives that I was just as hosed if the network went down as I would be using a cloud. Plus I’m cynical enough to think that the average cloud is better maintained and protected than the average PC. Look at the number serving as bots without their owners even knowing.

One example of a cloud computing service is the OnLive gaming platform. The game is rendered on a server somewhere, the video is streamed to your TV, and your controller input is streamed back. That’s an example of cloud computing that is more than storage.

Also, my friend who works for NASA rents/pays for some Amazon service where he can do computations on their servers.

Once marketers got hold of the term and realized it helped sell solutions, “cloud computing” became used for anything more remote than the local network accessing the local data center.

I’d be curious about the origin of the term. Comments?

My anecdotal experience as both a buyer and vendor of large-scale health systems is that the thing that drove the term in my world was the common use of a little PowerPoint cloud shape which began to be used to represent whatever happened between locality A and locality B (where the processes and data in A and B were specific responsibilities assigned to specific entities we could manage and evaluate).

So “cloud computing” went from something we mocked as happening magically “out there” to make the system work, to a term that carried a little panache–as if the “cloud” was a value add instead of simply a vaguely-defined remote function.

After the failure of thin-client computing (your computer is just a terminal connected to the Great Server in the Sky) as envisioned by Larry Ellison of Oracle, and the failure of Microsoft’s pay-as-you-go rented-apps idea (you don’t even license applications such as Office [you never thought EULAs entitled you to own stuff you paid for, did you?]; instead you connect your terminal computer to the Great App Tree at Microsoft and keep paying them to rent the use of these apps), they needed a rebranding for the concept.

Instead of ‘cloud’, say ‘abstracted’ and you’ll be pretty close to the mark. You don’t care where the resources are, just that you get the service for which you pay.

Personally, I’m very wary of cloud services for the very simple reason that they are abstracted. Who else has access to your data? Are the CIA / KGB / DGSE / Mossad / WBC reading your data?

I do not have a scholarly etymology but I have a lot of experience :slight_smile: When I worked for a company in 1998, we used a frame relay connection for something (can’t remember exactly what now). If you look at the illustration in this article, you will see that your router communicates into “the cloud.” We all called it “the cloud” because you don’t know and you don’t care what happens between the two endpoints–it was just some nebulous pile of network connections. I think this “cloud” jargon spread to other contexts, the way that jargon often does.

Once a marketer gets ahold of perfectly good technical jargon, it becomes ruined forever.

The cloud as a metaphor for a network has existed since at least the late ’70s. As mentioned above, it was just used as shorthand for “what’s in here doesn’t matter”. I first saw it used for X.25 networks, before the appearance of IP networks. X.25 standardized how to connect to a network; the nuts and bolts of how the service provider made it work were unimportant and not standardized, and therefore rendered as a cloud.

When IP and CLNS came along, people kept using clouds in drawings to represent networks. In these cases, we knew more about the technology required inside the cloud, but didn’t care about the topology.

The metaphor applies nicely to cloud computing, so it makes sense that they use the term.

Regarding marketing and terminology, a favorite one is “switching”. What is L2 or L3 switching? It’s not technically defined at all, so it’s basically used to mean “bridging and routing, only FAST!”

Another “usual” component is dynamic scaling. You don’t need to worry about disk space or CPU limits, because the environment will add that stuff magically.

These both talk about “clouds”, but any cites for when it became “the cloud”, as opposed to just “a cloud”?

This should clear it all up.

I don’t really know much about how it’s implemented, but the impression I get is that “the cloud” really falls into two main categories: software services and storage. For the latter, I don’t really understand the security issues, since all you really need to do is encrypt the data before it leaves your machine. And even for software services, since those can run in dynamically created virtual machines, I still don’t see why security should be an issue, assuming you can verify that the VM is secure.
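The “just encrypt it” idea is client-side encryption: the data is encrypted before upload, so the provider only ever holds ciphertext. Here is a toy Python sketch of that principle, using SHA-256 in counter mode as a stand-in stream cipher; this construction is illustrative only, and real deployments use vetted ciphers such as AES-GCM via an audited library:

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustrative only --
    not a substitute for a real, vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream."""
    ks = keystream(key, nonce, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))

decrypt = encrypt  # XOR stream cipher: encrypting twice round-trips

# The provider stores only the ciphertext; without the key it is noise.
key, nonce = b"my-secret-key", b"unique-per-file-nonce"
uploaded = encrypt(key, nonce, b"quarterly sales figures")
recovered = decrypt(key, nonce, uploaded)
```

As long as the key never leaves your machine, a breach at the provider (or a curious provider) yields nothing readable; the harder open questions are key management and the metadata the provider can still observe.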

I also see no one has mentioned distributed computing. I volunteer time on several of my personal machines, including several fairly high-end dual-processor servers, for medical research projects. They ensure the validity of the work done on their project by requiring a quorum for each work unit. IOW, every WU gets sent out to two or more different client machines for processing. The returned results have to be validated and match, or the result is discarded and the WU is reprocessed.

Using this method, the project emulates a supercomputer with power on the order of a few petaflops.
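The quorum check described above fits in a few lines. This is a toy Python sketch; the quorum size of 2 and the result values are made up for illustration:

```python
from collections import Counter

def validate_work_unit(results, quorum=2):
    """Toy quorum validation: accept a work unit's answer only if at
    least `quorum` independent clients returned the same result;
    otherwise return None, signalling the WU must be reprocessed."""
    if not results:
        return None
    winner, votes = Counter(results).most_common(1)[0]
    return winner if votes >= quorum else None

# Two honest clients agree; one flaky or malicious client disagrees.
accepted = validate_work_unit([42, 42, 17])   # -> 42
# No agreement yet, so the work unit goes back out for reprocessing.
pending = validate_work_unit([42, 17])        # -> None
```

Redundant computation costs the project a factor of two or more in throughput, but it lets untrusted volunteer machines contribute without any single bad result poisoning the science.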