Why are phone apps so huge nowadays?

I’m not sure this is FQ material, but I’ll give it a try. I just updated apps on my iPhone, and just like every time I do that I’m stunned by the size of the updates. Even a pretty basic (in my judgment) app with rather limited functionality will easily run up a few hundred MB. And that’s just the update, intended to fix “minor bugs”, not the app itself. I realise that graphics and visuals take up more space than they would have in the past, but is this the explanation? Or is it that developers rely on a range of frameworks that need to be installed along with the app instead of writing lower-level code themselves?

My understanding - and keep in mind I’m not an expert - is that some apps have background software that tracks you and records data, to be used by the app’s owners for various things, including selling the information to others. How prevalent that is I have no idea.

My understanding was that what was downloaded was a new version of the app, not just an update. But I am not basing that on anything in particular.

I’ve heard people ask the same question about apparently oversized updates for games on PCs, and the answer likely applies to phone apps too.

Apparently in many cases it’s because of the way the program is compressed on the device: it often isn’t practical to update just the bits that need changing, so users end up re-downloading much or all of the program rather than just the relatively small update. That can result in downloads of a gigabyte or more for a small fix, sometimes.

We are long past the days when everything in a program was hand-coded from the basics. For example, if the program needs to manipulate a picture, odds are it does not have a handwritten subroutine to do so - it uses a commercially available “edit pictures” library, which may contain a huge number of different subroutines, many of which are never used, yet the whole library is included. The same goes for date calculation routines - even “what day of the week would that be?” is part of someone’s library. Plus, a lot of things like custom settings will use XML rather than a simple text file or some other home-made method of saving data, meaning there’s likely an XML library of subroutines too…
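To make that concrete, here’s a rough Python illustration of the trade-off (the standard datetime module stands in here for the kind of much larger third-party library a phone app would actually pull in): the hand-rolled day-of-the-week calculation versus the one-line library call nearly everyone writes instead. The library call is easier to write and maintain, but the whole library ships with the app even if this is all you use it for.

```python
from datetime import date

def day_of_week_by_hand(y, m, d):
    """Zeller's congruence: 0 = Saturday, 1 = Sunday, ..., 6 = Friday."""
    if m < 3:            # Zeller treats Jan/Feb as months 13/14 of the previous year
        m += 12
        y -= 1
    k, j = y % 100, y // 100
    return (d + (13 * (m + 1)) // 5 + k + k // 4 + j // 4 + 5 * j) % 7

# What nearly everyone does now: one call into a library,
# and the whole library comes along for the ride.
print(date(2024, 3, 1).strftime("%A"))   # Friday
print(day_of_week_by_hand(2024, 3, 1))   # 6, i.e. Friday in Zeller's numbering
```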

Plus, when updating, it’s much simpler to download the whole program. Otherwise you either have to do an analysis - “if upgrading from 3.1 you need these modules, if you were on 2.94a you need these additional modules” - or you get around that by enforcing “to get to 3.3 from 2.94a you first need to update to 3.0”. Then the update program needs the smarts to figure that out, and you need to keep a range of previous program versions accessible. It’s far simpler to just download and install the whole program, and if it overwrites some identical components, so be it. (Particularly in something like the App Store, where you don’t get to put multiple versions out there; they probably don’t even want a separate update program - your basic install checks “is this a new install or an update?” and does the appropriate installation.) Just be sure the program you are writing can capture custom settings and data from previous versions when updating.
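Here’s a rough sketch of the bookkeeping that incremental updates force on the developer, compared with just shipping everything (all version numbers and module names below are made up for illustration):

```python
# Hypothetical delta-update bookkeeping: every supported "from" version needs
# its own list of modules to fetch, and the table has to be maintained forever
# because old versions linger in the wild.
DELTAS = {
    ("2.94a", "3.3"): ["core", "ui", "net", "assets"],
    ("3.0",   "3.3"): ["ui", "net"],
    ("3.1",   "3.3"): ["net"],
}

ALL_MODULES = ["core", "ui", "net", "assets"]

def modules_to_download(installed: str, target: str) -> list[str]:
    try:
        return DELTAS[(installed, target)]
    except KeyError:
        return ALL_MODULES   # unknown upgrade path: fall back to the whole thing

# The full-reinstall approach needs none of the above:
def full_install(target: str) -> list[str]:
    return ALL_MODULES
```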

The previous posters got a lot of the technical reasons (standard included libraries, downloading the whole app not just the patch), but there is also the fact that there just isn’t really incentive to optimize for size. Most folks have a high-speed internet connection at least some of the time, and phone storage has gotten very large.

If you don’t have any reason to make your app smaller (i.e. customers demanding it) why put in the effort to do so? I know I have developed applications where entire feature sets are removed but the code is left in because why take the risk of removing it? Nobody cares if the app is a few MB larger than it needs to be.

I wonder (maybe slightly afield for FQ)… I understand that extremely efficient code can sometimes be hard to parse with human eyes. One of the benefits of not being as concerned about file size is that we can write programs that are easier to follow. Is there another end of the spectrum, though, where we let programs get so bogged down with unnecessary cruft that it’s a huge effort to understand what is important and what isn’t? Or, if we’re using some functions from a library but writing our own for the things where the library doesn’t give us all the functionality we need, we end up shipping both library.function and in-house.function, and relying on documentation (or worse, institutional knowledge) to remember not to use library.function, or not to use it in certain situations.
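That last situation is easy to sketch. Here’s a hedged, made-up example in Python using the standard textwrap module: the team writes its own fill() because the library’s version doesn’t quite do what they need, and now nothing but a docstring tells a new developer which one to call.

```python
import textwrap

def fill(text: str, width: int = 72) -> str:
    """In-house replacement for textwrap.fill().

    Team convention (documented here and nowhere else): use this one, not
    textwrap.fill(), because ours preserves paragraph breaks. Nothing stops
    a new developer from importing textwrap directly and getting subtly
    different behaviour.
    """
    paragraphs = text.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width) for p in paragraphs)

text = "first paragraph\n\nsecond paragraph"
print(textwrap.fill(text, 20))   # library version: collapses the paragraph break
print(fill(text, 20))            # in-house version: keeps it
```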

A friend of mine, back in the day, was digging into his TRS-80 and, while disassembling the ROM code, found at least one spot where the program jumped into the middle of an op code. To save space, the programmers for Radio Shack had apparently scanned the code for a sequence of bytes that happened to form a short subroutine, then jumped to that rather than repeat the code.

Back then, an 8K chip was expensive.

Today, most phones have gigabytes of RAM. Meanwhile, being able to code quickly and to read (understand) the code when modifications are needed is the priority, rather than “clever” coding. Plus a lot of the simpler stuff is computer generated, even before AI. These tools generate individual subroutines rather than trying to make a more generic routine serve several purposes. (Which also makes maintenance easier.)

This is a great technical point when talking about application size.

Many tools that will auto-generate code do it in a way that prioritizes correctness and maintainability without worrying about size or speed at all (or at least that’s way down the pecking order). I’ve even known plenty of programmers myself who would much rather write dozens of routines that each do one specific thing than one routine that could be used over and over. And in plenty of cases (portability being one that springs to mind) that’s actually probably the correct design choice.

Even decades ago it was a mantra not to optimize your code until the end, and only do it then if it was actually required. I believe the quote was “premature optimization is the root of all evil”.

This isn’t wrong, but isn’t the reason for today’s bloated apps. Dramatically reducing the size of apps doesn’t require any cleverness whatsoever. Just a tiny degree of care about the problem. In many cases these apps are including resources (images, etc.) that aren’t used at all, but they couldn’t be bothered to remove them. Or redundant resources (like ones that have been duplicated for language/localization reasons, but don’t actually differ).
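As a hedged illustration of how little cleverness it takes: a check for unused assets can be as crude as the sketch below (directory names are placeholders, and matching by filename is obviously approximate).

```python
# Rough sketch: flag assets whose filenames are never mentioned anywhere
# in the source tree. "src" and "assets" are placeholder paths.
import os

SRC_DIR, ASSET_DIR = "src", "assets"

source_text = ""
for root, _, files in os.walk(SRC_DIR):
    for name in files:
        with open(os.path.join(root, name), errors="ignore") as f:
            source_text += f.read()

for root, _, files in os.walk(ASSET_DIR):
    for name in files:
        if name not in source_text:
            path = os.path.join(root, name)
            print(f"possibly unused: {path} ({os.path.getsize(path) / 1e6:.1f} MB)")
```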

But as stated earlier, there’s just no incentive to improve things. Most software development is under time pressure. So when the choice is between adding a required feature and optimizing something that few will notice, it’s the feature that gets prioritized.

I once noticed a 10 MB PNG image of pure noise that was included in an app my company produced. More than 1% of the total package size. It was a test image that had been mistakenly included as part of the release package. Getting that removed took a fair amount of convincing, and I’m not even on the responsible team. Most of this stuff just slips by unless someone notices and makes a stink about it. And making a stink about things is often a means of making enemies.

App developer here. In my experience most of the app package consists of what are called assets – images, videos, fonts, audio files, etc. If an app has an introductory video that runs the first time the app is opened, it is often included in the app package. That could be hundreds of MB and is only seen once. Also, many apps include images for multiple screen resolutions and languages. These can really increase the size of an app package.
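If you want to see this for yourself, a crude breakdown by file type over an unpacked package (the path below is a placeholder) tends to show the media assets dwarfing the actual code:

```python
# Quick-and-dirty breakdown of an unpacked app package by file extension,
# to show where the megabytes actually go. PACKAGE_DIR is a placeholder.
import os
from collections import Counter

PACKAGE_DIR = "MyApp.app"   # or an unzipped .apk / .ipa

sizes = Counter()
for root, _, files in os.walk(PACKAGE_DIR):
    for name in files:
        ext = os.path.splitext(name)[1].lower() or "(none)"
        sizes[ext] += os.path.getsize(os.path.join(root, name))

for ext, total in sizes.most_common(10):
    print(f"{ext:10s} {total / 1e6:8.1f} MB")
```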

I take it then, there is rarely an attempt to delete this sort of thing once it’s been played?

And everyone was absolutely sure it was stochastic noise? The XZ Utils exploit comes to mind here.

Code bloat is just one of the continual ills that besets us. I had a quick look at the basic application sizes on my iPhone. Photos is 971 kilobytes. Yup, less than one megabyte. Apple clearly understands how to use the available libraries. Then I slam into some trivial app that scrapes and presents meteorological data and I see 400 megabytes. :enraged_face:

In principle the development environment provided to apps should offer an extraordinarily rich set of capabilities. Many apps just code in HTML5 and let the browser engine do the work. I worry that we are just getting many copies of exactly the same framework libraries, one per app. A huge unsolved problem of software now seems to be how to manage the “you can’t get there from here” morass of constructing a consistent working tree of interdependent libraries that are all managed and evolve independently. Software artefacts get frozen into weird version combinations that force an entire local copy of every damn library they use to come with them. Tools like npm, Maven, pip, Poetry, conda… I lose track.
It was bad enough needing to statically link executables. Now you statically define the source tree. Groan.
This is the price we pay for the ability and freedom to develop fast and build on fast development by others. So one might say that the pace of software has in part been enabled by not really needing to care about such bloat.
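For a small taste of the problem in Python terms, you can dump what’s installed in an environment and what each package says it needs; getting all of those declared requirements to agree at once is exactly the “consistent working tree” puzzle. (This uses the standard importlib.metadata module, so it only reflects whatever happens to be installed wherever you run it.)

```python
# List every installed Python distribution and its declared dependencies,
# a small window into the "consistent working tree" problem.
from importlib.metadata import distributions

for dist in sorted(distributions(), key=lambda d: (d.metadata["Name"] or "").lower()):
    print(f"{dist.metadata['Name']} {dist.version}")
    for req in (dist.requires or []):
        print(f"    needs {req}")
```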

You’re telling me. Every attempt to “fix” the problem just becomes an enabler for worse problems in the future.

Containerization, for instance (Docker, etc.), is pitched as a panacea for these versioning issues. But when everybody has baked in a slightly different version of multi-gigabyte (don’t ask) libraries, things go downhill fast. The container builders like it because they can all iterate at their own pace and not worry about their stuff breaking with a new library version. It’s not as great for the clients of these containers.

All this kind of makes me happy I run my phone with the minimum apps I can get away with (OK, I have a crossword puzzle app, so sue me).

Then again, I have a J3, so by today’s standards I have “insufficient” memory capabilities. Yes, I probably should upgrade, but as long as I can make phone calls and text, that’s 95% of what I need from a phone. I’m making sure I can upgrade my PC to Windows 11 before getting a new phone.

Wasn’t that the basis of DLL hell in early Windows? Applications relied on assorted DLL libraries, but as the versions evolved, some used newer versions, some relied on older versions with the same name, and loading them all in a common workspace was a bad idea. I ran across one situation where two apps used different DLLs with the same name - so if you ran A before B, it was fine, but run B before A and A crashed immediately. The first one to run loaded its version of the DLL; I’m not sure why B worked fine - presumably it rarely called on that library, so it rarely noticed if the wrong one was loaded.

And with current phone apps, the problem could be even worse - hence, it’s simplest if each app exists in its own little universe, only referencing outside libraries if built into the phone OS.

Basically. Microsoft largely solved the problem through a variety of methods, including virtualization of the files.

But more fundamentally the problem is with shitty libraries that were constantly breaking. There’s no reason a library should break things when it’s updated, unless it’s deliberately designed as a breaking change, in which case it deserves a new major version number.
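That’s essentially the promise of semantic versioning. A hedged sketch of the convention (ignoring the special treatment of 0.x releases):

```python
# Semantic versioning in a nutshell: MAJOR.MINOR.PATCH, where only a change
# in MAJOR is allowed to break existing callers.
def is_compatible_upgrade(installed: str, candidate: str) -> bool:
    old_major, *_ = (int(x) for x in installed.split("."))
    new_major, *_ = (int(x) for x in candidate.split("."))
    return new_major == old_major

print(is_compatible_upgrade("1.2.3", "1.9.0"))   # True: same major, must not break
print(is_compatible_upgrade("1.9.0", "2.0.0"))   # False: breaking changes allowed
```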

Things got genuinely better here in the past 20-ish years, in part because Microsoft started building more components into the OS and tested them well. Say what you will about MS, but backwards compatibility has always been a strong point for them. Microsoft also started holding vendors to a higher standard, in part by subjecting them to testing.

In other areas, though, things are as bad as they’ve always been. Open Source, for all of its other benefits, has not helped here, since there are essentially no standards that libraries are held to. In many cases they don’t bother to keep the interface consistent, let alone care about breaking older apps.

So an entire segment of the software industry has cropped up to solve these problems. Package managers, dependency trackers, containers, container managers, and so on and so forth… all so that an app that depends on a dozen libraries can at least guarantee that the one “magic” combination it was developed with is still available. Sometimes it seems like half the industry is dedicated to fixing this problem.

I feel like overuse of libraries is also a factor here. Even incredibly basic things are being outsourced to libraries (a famous and dumb example being the npm left-pad incident, where a library that did nothing more than add padding characters to the start of a string was deleted, leaving thousands of projects in the lurch).
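For a sense of how little code was at stake, this is roughly what left-pad did, translated into Python (the original was a handful of lines of JavaScript, and Python’s built-in str.rjust already covers it):

```python
def left_pad(s: str, length: int, ch: str = " ") -> str:
    """Roughly what npm's left-pad did: pad the front of a string to a length."""
    return ch * max(0, length - len(s)) + s

print(left_pad("42", 5))        # '   42'
print(left_pad("42", 5, "0"))   # '00042'
print("42".rjust(5, "0"))       # '00042' -- already in the standard library
```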

It’s not that the developer couldn’t be bothered to remove supposedly redundant assets - they may not actually be redundant, and there’s no real benefit to parsing out even the different versions of English in the world. Canada, for instance, uses US English keyboard layouts, Canadian French layouts, and there’s a rare but not unknown Canadian bilingual keyboard layout for people working in both languages. That may or may not be something the app developer has to account for (different accent locations, slightly different expectations for key preferences, etc.), but it’s just one example. Many of the interface assets are still just bitmap images rather than vectors. Since phones have high-resolution screens, those assets have to be pretty large, and they also have to be present for all the different sizes of phones, screens, and screen resolutions the app supports. That adds up very fast.
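Back-of-the-envelope, the multiplication gets out of hand quickly. All of the numbers below are purely illustrative, but they show how a modest set of artwork balloons once you duplicate it per screen density and per localized variant:

```python
base_assets_mb = 10          # one full set of bitmap UI assets at 1x scale (made up)
density_factor = 1 + 4 + 9   # 1x, 2x, 3x copies scale roughly with pixel count
localized_share = 0.3        # fraction of assets with baked-in text (made up)
locales = 12                 # number of localized variants shipped (made up)

common = base_assets_mb * density_factor * (1 - localized_share)
localized = base_assets_mb * density_factor * localized_share * locales
print(f"~{common + localized:.0f} MB of assets")   # roughly 600 MB from 10 MB of artwork
```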

The way that apps are cryptographically signed nowadays for security purposes makes me think that even if it was possible/prudent to do this in the past, it’s probably not possible now. Modern computer and phone security relies on “is what I just downloaded and plan to install the full and complete and unaltered thing?” and “is what I just installed still the full and complete and unaltered version?” You also don’t know for sure that an app is done with an asset and will never need it again. If you have to reset the app to its first-run state, but it crashes or asks you to re-download a piece of it (hope you’re on wi-fi) that you then have to wait for, that’s a pretty bad user experience.
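The integrity side of that, at its simplest, is just comparing a hash of what’s on disk against a digest that was signed. Real app stores sign whole manifests of per-file hashes rather than one blob, but the sketch below gives the flavour:

```python
import hashlib

def matches_expected_hash(path: str, expected_sha256: str) -> bool:
    """Does the file on disk still match the digest that was signed off on?"""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256

# If an app quietly deleted or altered bundled assets after install, checks
# like this over the signed bundle would start failing.
```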

What I have in mind are assets that may or may not have localized text in them, but are duplicated nevertheless because some of those assets may need to be localized.

Ideally you’d only download the ones you need. I’m not sure if the various app stores can deliver a custom package based on the target device, but that would certainly be a nice improvement over including all assets in every possible resolution.

Every app gets a cache directory where it can put download-on-demand assets. If they wanted, they could have a 10 MB executable package and download images only when required and only at the desired resolution.
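A hedged sketch of that download-on-demand pattern, with generic Python standing in for the platform-specific APIs (the cache path and URL are placeholders):

```python
import os
import urllib.request

CACHE_DIR = "/path/to/app/cache"          # placeholder for the app's cache directory
ASSET_URL = "https://example.com/assets"  # placeholder CDN base URL

def get_asset(name: str) -> str:
    """Return a local path to the asset, downloading it only on first use."""
    local = os.path.join(CACHE_DIR, name)
    if not os.path.exists(local):
        os.makedirs(CACHE_DIR, exist_ok=True)
        urllib.request.urlretrieve(f"{ASSET_URL}/{name}", local)
    return local

# First request downloads only the resolution the device actually needs;
# later requests hit the cache, which the OS can evict if space runs low.
```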

You’re right that bloating up the app to give a consistent first-run experience for everyone is probably part of the motivation. Nevertheless, they aren’t required to do that.