how many music recordings exist in the world?

For whatever reason, I’m curious how big a hard disk you’d need to contain all of the music recordings the world has ever produced. More specifically, I’d like to compute when a hard disk of that size will be priced so that the average person could own one, the way a 100 GB HD is today.

Any thoughts?

Thanks.

Unfortunately, you’ll have to qualify your question quite a bit before you can even start on an answer. First: how are you going to define recordings? Do remixes, live versions, and remasterings count? Second: how are you storing them (which may be the easiest one to answer)? A 128 kbps mp3 is roughly 1 MB/minute of music. Thus, a full-length 70 minute CD will be approx. 70 MB in size.
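The 1 MB/minute rule of thumb can be checked directly from the bitrate. This is a minimal sketch, assuming a constant bitrate and decimal megabytes (10**6 bytes); the function name is made up for illustration.

```python
def mp3_megabytes(minutes, kbps=128):
    """Size in MB (10**6 bytes) of a constant-bitrate stream of the given length."""
    bits = kbps * 1000 * 60 * minutes   # kilobits/s -> bits over the whole duration
    return bits / 8 / 1_000_000         # bits -> bytes -> megabytes

print(mp3_megabytes(1))    # 0.96
print(mp3_megabytes(70))   # 67.2
```

So "1 MB per minute" rounds up slightly, and a 70 minute album lands a little under 70 MB.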

The first qualification is going to be the most difficult. Good luck.

I’d imagine that we’d run into petabytes (a thousand thousand gigabytes IIRC), so that’ll be a long time coming. By the time that drive’s produced, I think inflation will make it a bit more pricey than today’s 100 GB drives.

To be sure, it would be more helpful to qualify the question before you can even start to look for an answer.

As a starting point, the Gracenote CD service says it has:
  • 2,775,786 CDs
  • 35,526,942 Songs

I suspect that this might be the largest music database, although allmusic.com or musicmatch could have more. These stats include standard releases, greatest hits, live albums, bootlegs, and some duplicates. It also contains a few data CDs and some homemade CDs as well. It is probably light on classical and jazz CDs - and there are quite a few of these. Of course, it will not include anything that hasn’t been released on CD - old stuff, rare stuff, etc.

CaveMike, thanks very much for that one. I suspect that Gracenote has a huge percentage of all CDs made, so that gives me a great start.

So, that leaves recordings that still exist only on other media. I’m thinking specifically of recordings kept in the Smithsonian or such.

Chairman Pow wrote

Um, no. You can estimate anything in this world, and I already defined the precise thing I’m after: “how many recordings exist in the world”. The process of estimation is simple. You decide what the right data is, you quantify it as best you can, and you add it all up. Estimating how accurate your estimate is, is reasonably simple as well.

FYI, here’s the mechanics of this estimation:
a) you estimate how many recordings exist (i.e. my question here)
b) for many (most?) of those recordings, you estimate their duration
c) you pick a compression ratio
d) you multiply all of the above together to get the total size.
e) you estimate the future rate of change in the cost/byte of disk storage
f) you solve the basic y=mx+b for the above to estimate the final answer.
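Steps a) through f) can be sketched as one small function. This is only an illustration under stated assumptions: it assumes affordable disk capacity doubles at a fixed interval, so the "y=mx+b" line lives in log space (log2 of capacity grows linearly with time). The function name and the inputs in the example call are placeholders, not figures from this thread.

```python
import math

def years_until_affordable(n_recordings, avg_minutes, mb_per_minute,
                           affordable_mb_today, doubling_months):
    """Years until an affordable disk can hold the whole collection."""
    # Steps a-d: multiply count, duration and density to get the total size.
    total_mb = n_recordings * avg_minutes * mb_per_minute
    # Steps e-f: capacity doubles every `doubling_months`, so
    # log2(capacity) = log2(today) + t / doubling_months. Solve for t.
    doublings = math.log2(total_mb / affordable_mb_today)
    return max(doublings, 0.0) * doubling_months / 12

# Placeholder inputs: a collection 8x larger than today's affordable disk,
# with capacity doubling once a year, needs 3 doublings.
print(years_until_affordable(800_000, 1, 1, 100_000, 12))   # 3.0
```

Because the relationship is logarithmic, large errors in step a) move the final answer by only a few doubling periods.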

This is not a valuable question. Can you estimate the number of remixes that exist more easily than you can the number of live recordings? Of course not. In fact, if I answered your question, it would make the estimation less precise, not more so, because we’d have to throw in the inaccuracy of an additional estimation. Most importantly, if you did want to answer this question, you would answer it after you had already answered the questions I posed above.

You have the cart before the horse. Picking a compression ratio is just one part of the estimation, and easily the simplest one; you just pick a number (say 128 kbps). This doesn’t need to be defined before we estimate the number of recordings, and in fact has zero to do with the number of recordings.

Where exactly did this number come from? It sounds like you basically picked the biggest number you could think of (“hmm, terabytes exist today, and I know peta comes after tera”). The fact that you’re not even positive of the value of a petabyte devalues your comment even further.

I don’t mean to be harsh, but you’ve added zero in the way of data to my silly question. The fact that I’m asking a silly question doesn’t change the fact that it has an answer, and it doesn’t change the basic mechanics of doing an estimation.

My WAG is that there have to be at least a billion separate songs recorded out there, if you are going to include anything. I mean, I have personally recorded about 20 albums of my own music. If only 1/5 or so of the world population has ever sung a note into a tape recorder, and if that tape still exists somewhere, then you easily have 1B right there.

Further, we have over 100 years of home audio. Don’t underestimate the number of extant tunes from before the 1920s. I would say millions right there.

Universal Music apparently has a 300,000 track back catalogue in Europe (cite). Extrapolating, I think it would be fair to say the number of commercially produced music recordings in the world tops out at somewhere about 1,000,000 to 10,000,000. Working with an average track length of 3:00 and 1 MB/min, we get about 3 TB to 30 TB worth of music.
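The arithmetic behind that range, as a quick sketch. It assumes decimal units (1 TB = 1,000,000 MB); the function name is illustrative.

```python
def catalogue_terabytes(tracks, minutes_per_track=3, mb_per_minute=1):
    """Total catalogue size in TB for a given number of tracks."""
    total_mb = tracks * minutes_per_track * mb_per_minute
    return total_mb / 1_000_000   # MB -> TB (decimal units)

for tracks in (1_000_000, 10_000_000):
    print(tracks, "tracks ->", catalogue_terabytes(tracks), "TB")
```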

Google style commoditized storage is roughly $1/GB so let’s say $2 just to be daring and we get 3 TB at $6000 and 30 TB at a mere $60k.

Sounds like an amazing bargain to me.

Using another set of calculations, a 4 GB CF card costs about $400 and weighs about 16 g (cite). This would mean storing all the world’s music onto 750 - 7500 CF cards would cost between $300,000 and $3,000,000 and weigh you down by 12 - 120 kg.

Alternatively, 75 - 750 iPods would cost only $37,500 - $375,000 and weigh about the same. A much better solution IMHO.
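The three storage options can be compared with the same few lines. A rough sketch: the disk price and CF card figures are the ones quoted above, while the iPod capacity (40 GB) and price ($500) are not stated in the post and are inferred here from its unit counts and totals, so treat them as assumptions. The helper name is made up.

```python
import math

def media_plan(total_gb, unit_gb, unit_usd, unit_grams=None):
    """Return (units needed, total cost in USD, total weight in kg or None)."""
    units = math.ceil(total_gb / unit_gb)
    cost = units * unit_usd
    weight_kg = None if unit_grams is None else units * unit_grams / 1000
    return units, cost, weight_kg

for total_tb in (3, 30):
    total_gb = total_tb * 1000   # decimal units
    print(f"{total_tb} TB")
    print("  commodity disk at $2/GB:", total_gb * 2, "USD")
    print("  4 GB CF cards:", media_plan(total_gb, 4, 400, 16))
    # 40 GB and $500 per iPod are inferred assumptions, not quoted figures.
    print("  40 GB iPods:", media_plan(total_gb, 40, 500))
```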

This is an immensely valuable question. If you exclude remixes, you’re excluding tens of thousands of songs (or possibly more). If you exclude live versions, same deal. Thus your theoretical storage space will be much smaller.

Think of it in terms of Classical music. Is ONE recording of a particular work enough, or do you want to preserve ALL recordings of that work? This could multiply the classical section of your collection by three or four times, certainly significant. The same goes for “covers” of contemporary music, although they make up a much smaller percentage of the whole. If you want any kind of a meaningful estimate, these are questions you need to answer.

It doesn’t really matter what the exact number is, because even if my estimates were off by a factor of 100, it would STILL be cheap enough for any major recording star to buy with one of their recording contracts, with plenty of pocket change left over.

Calm down, dude. I just said you’d have to qualify your question. My list of qualifiers was not irrelevant, but possibly incomplete (although it was an open-ended list). From your subsequent posts, I imagine that you’re after “published” music and not demos that were recorded in someone’s basement and never released, or sent to record labels in hopes of getting a deal, but never even listened to. How about someone taping “happy birthday” to send to their grandmother? How about five people bootlegging the same concert, do each of those count as a separate recording? Asking for qualifiers is valid as I/we don’t know.

Sure it is. Are you including remixes, live recordings, et al., or not?

You are correct: it has nothing to do with the number of recordings, but that is irrelevant. There will be a difference in the amount of space that will be needed on your hard disk if you choose to put it in 128 kbps mp3 or 44 kHz WAV. If we assume (don’t go apoplectic when I make an estimate/assumption here, after all, your question is implicitly estimating, even without those awful qualifiers) and say that 70 min. works out to 70 MB as a 128 kbps mp3 and 650 MB on CD, you’re nearly a factor of ten off. I think that’s statistically significant. A hard drive that’s a factor of 10 larger would have a fairly large price difference/amount of time before it’s available.
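The factor of ten can be checked from first principles. A sketch, assuming standard CD audio parameters (44.1 kHz, 16 bits per sample, stereo) and decimal megabytes; note that 650 MB is the nominal data capacity of a CD, and raw audio for 70 minutes works out somewhat higher.

```python
def pcm_megabytes(minutes, sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Size in MB (10**6 bytes) of uncompressed PCM audio."""
    bytes_per_second = sample_rate_hz * bits_per_sample * channels // 8
    return bytes_per_second * 60 * minutes / 1_000_000

cd_mb = pcm_megabytes(70)
print(cd_mb)        # uncompressed size of 70 minutes of CD audio, in MB
print(650 / 70)     # ratio using the round figures from the post
print(cd_mb / 70)   # ratio using the computed uncompressed size
```

Either way the ratio comes out at roughly ten, so the choice of format shifts the total by about an order of magnitude.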

Petabytes don’t exist today? Consider the IIRC to be the equivalent of, “ummm” and rolling my eyes back into my head while thinking. Surely that’s not too far of a stretch.

Sounds like someone’s feeling a bit insecure. I never claimed your question was silly and indeed, I prompted you for more information so I could help you solve the question. Take a deep breath, it’s OK.

Reread the OP; he’s looking to put it on a single hard drive and have it be priced to go.

Not that the idea of having 750 iPods hooked up together isn’t fun. How the hell would you navigate the playlist?

I wouldn’t underestimate the number of professionally produced recordings on vinyl that never made it to CD. My guess is that there’s a rather large body of work, especially in less mainstream genres (i.e., not pop/rock), that didn’t make the transition.

So, I’ve got the rough estimate I was looking for. For anyone who’s curious, here it is.

Basically, I used CaveMike’s number of 35m songs on Gracenote, and made these assumptions:

  • Gracenote contains 1/4 of all recordings. The definition of “recording” (as many have pointed out) is vague, and Aeschines makes a good case about amateur recordings, but I’m really after some level of commercial release, not every time someone sang happy birthday while someone else happened to be there with a tape recorder. I’ll happily concede that 1/4 may be too large; perhaps it’s even 1/40th. But even an order of magnitude isn’t significant in the final result, because hard disk size grows exponentially: a factor of ten error here equates to about a 5 year difference, which is minor in the scope of things.
  • Average length of a recording is 3 minutes
  • Density of recording = 1 MB/min
  • Current HD size that the average person could own = 100 GB
  • Time for HD size to double for similar cost = 16 months (source: Wired magazine, 1998; not surprisingly, using their data, 16 months has proven accurate through 2004)
  • No inflation (hah! but I didn’t feel like dealing with it, and as it turns out, it’s not that significant)

Doing all the math with the above (which I’m happy to post for the curious), the answer is 29 years.
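For anyone who wants to rerun the estimate, here is one way to set it up. It is a sketch under the assumptions listed above, plus decimal units (100 GB = 100,000 MB) and a clean doubling every 16 months; the original math was not posted, so this is not necessarily the calculation that produced 29 years. With the inputs taken exactly as listed, this version lands nearer 16 years, which suggests the unposted math differed somewhere (units, starting size, or growth model).

```python
import math

GRACENOTE_SONGS = 35_526_942   # song count quoted from Gracenote earlier in the thread

def total_mb(gracenote_share, minutes=3, mb_per_minute=1):
    """Total size in MB if Gracenote holds `gracenote_share` of all recordings."""
    return GRACENOTE_SONGS / gracenote_share * minutes * mb_per_minute

def years_to_reach(size_mb, affordable_mb=100_000, doubling_months=16):
    """Years until an affordable disk grows from today's size to size_mb."""
    return math.log2(size_mb / affordable_mb) * doubling_months / 12

base = years_to_reach(total_mb(1 / 4))
pessimistic = years_to_reach(total_mb(1 / 40))
print(f"Gracenote = 1/4 of everything:  {base:.1f} years")
print(f"Gracenote = 1/40 of everything: {pessimistic:.1f} years")
print(f"Cost of a factor-of-ten error:  {pessimistic - base:.1f} years")
```

The last line is the robust part: a tenfold error in the recording count moves the answer by a little over four years, in line with the "5 year difference" mentioned in the first assumption.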

I.e. in 30 years give or take, the average person will be able to walk into Fry’s and buy a hard disk capable of holding every commercial musical recording ever made.

Before starting this thing, I didn’t know if it would be 200 years or 500 years or what. I think it’s interesting that it’s not that far away in the scope of things.

Thanks especially to CaveMike, Shalmanese, and Aeschines for the very helpful responses.

You didn’t take into account the music that will be recorded between now and whenever this theoretical HD will be released, did you? Considering that recorded music hasn’t been around all that long, and (assumption) recording (esp. commercial recording) has grown exponentially, there should be quite a bit more music by the time that HD is released.
Unless you’re going for a “20th century” music library.

From here.

That quote only talks about the live concerts (not even commercially released works) by tape friendly artists stored by Archive.org.
So there’s 10 Terabytes right there, without ever touching commercially released music.

And to put that in perspective, I poked around a bit - and in 2003 Warner Music Group released 7,581 original titles. That’s just one of the Big Five music label groups, and only for one (crappy) year. Think about how many smaller and indie labels there are, and how many years they’ve been recording music.

From here.
BTW, Archive.org is the single greatest live music resource on the web that I know of, and a fascinating project all around. Plus, everything there is legal and free to the public.

Oops!

Upon re-reading my post above I see that Nielsen SoundScan recorded 7,581 titles in all for 2003. WMG only had about a thousand of them.

Sorry about that, but I think we’re still looking at petabytes of info when you think that for every album recorded by Nielsen, four or five are commercially released for sale by artists registering them.

No cite on that 4:1 or 5:1 ratio, but that’s my professional opinion as someone in the business working with local, regional and national talent. I see, hear and interact with a hell of a lot of bands at all levels of their careers and only a small percentage of them have gone soundscan, or even UPCs.

But that’s just an opinion, so take it for what it’s worth…

Garfield226 wrote

This is true. However, even if the amount of music recorded since the dawn of recording doubles in the next thirty years (not far fetched), it’ll only change my number from 29 years to a bit over 30 years. If the amount of music ever recorded quadruples in the next 30 years, my number will go from 29 years to 31.
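The reason future growth matters so little is that each doubling of the music total costs only one more doubling of disk size. A minimal sketch, assuming the 16 month doubling period from the estimate; the function name is illustrative.

```python
import math

def extra_years(growth_factor, doubling_months=16):
    """Extra waiting time if the total amount of music grows by growth_factor."""
    return math.log2(growth_factor) * doubling_months / 12

print(round(extra_years(2), 2))   # music doubles    -> 1.33 more years
print(round(extra_years(4), 2))   # music quadruples -> 2.67 more years
```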

I really doubt the production of new music is growing exponentially. It is true that the growth rate will be sharper than in the past, especially since everything will start off being available digitally, and music production has gotten and will continue to get much easier. However, the number of consumers of music won’t grow at nearly the same rate as technology (plus the rate of growth for humans is slowing down). Also (and most importantly), I suspect the number of musicians won’t grow dramatically as a percentage of humanity.

You make a good point though.

picker, that’s some very interesting info; thanks. Also, thanks for the pointer to archive.org; it does look like a cool project. Unfortunately, I’ve only heard of 16 of their 500 bands. I hope they do well, but they’ll need some significant traction. I wouldn’t be surprised to hear that a very high percentage of their impressive download rate is from the Grateful Dead and Tenacious D.

Side note one: one of the artists on archive.org had a band that opened up for a band I had once upon a time. And I was hardly big by anyone’s standards. Also, his band recorded some stuff in a small music studio I’d set up in my apartment bedroom. Though I’m impressed with his current music, it’s obvious he’s hardly a superstar. I assume many on the archive.org “label” are of similar stature.

Side note two: Archive.org’s numbers seem to indicate that they aren’t storing the music compressed. If they stored it as 128kbps mp3s, they could store it in less than 1/20th the space. Even storing it in a lossless compression format would significantly reduce their required resources.

Hi Bill H.

Actually, the music on Archive.org is compressed - using .shn or FLAC compression - both are lossless compression technologies that reduce file size approximately 50%. The reason that there is so much data is that these files tend to be complete live shows, ranging from 75 to 200 minutes, rather than the 50-65 minutes you tend to find for studio projects/releases.
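To get a feel for the per-show sizes this implies, here is a rough sketch. It assumes the source audio is CD quality (44.1 kHz, 16-bit, stereo), which is an assumption and not something stated above, and it uses the approximate 50% lossless ratio as given.

```python
def lossless_show_megabytes(minutes, compression_ratio=0.5):
    """Approximate size in MB of a losslessly compressed show (CD-quality source assumed)."""
    pcm_bytes_per_second = 44_100 * 2 * 2   # sample rate * 2 bytes per sample * 2 channels
    pcm_mb = pcm_bytes_per_second * 60 * minutes / 1_000_000
    return pcm_mb * compression_ratio

for minutes in (75, 200):
    print(minutes, "min ->", round(lossless_show_megabytes(minutes)), "MB")
```

Under those assumptions a single show runs from roughly 400 MB to about 1 GB, so a large collection of shows adds up to terabytes quickly.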

Shorten is the software I prefer - it’s easy and free, and there’s an inherent checksum (data verification tool) built right in. For Windoze and for Mac or Linux.

I would estimate that I recognize the names of approximately 50-60% of the artists on there at best, and I am extremely well-versed in jam and trade friendly music. Pretty impressive collection, eh?

Oh yeah -

for more info on the vast amount of free downloadable music out there, check out etree. It’s a lot more technical than Archive, but there’s even more material out there, and their server lists are amazing.

And if you want mp3 or Ogg Vorbis as well as SHN, check out FurthurNet - it’s more of a Kazaa style model, but no spyware and it only allows music from an approved list of artists (albeit a very impressive list) - artists must specifically grant permission for their music to be indexed, and can also specifically exclude certain shows (i.e. ones that are being recorded for release or what have you).

It’s still officially a Beta, but I’ve used it for several years on both Mac and PC without issues.

The neatest thing is that you can get REALLY specific with searches - artist (of course), date range, city, state, file format (shn, ov, mp3) and more. And you WILL know a lot of the bands on there.

-P