Helldivers 2 - 85% reduction in install size with minimal performance impact
13 points by benton
While I am personally very interested in this to free up some space on my Steam Deck, I've flagged as off-topic because there is effectively no technical information in this post, other than some vague references to "data duplication" affecting spinning rust users.
I would be very interested in a submission that actually detailed how they achieved this.
These loading time projections were based on industry data - comparing the loading times between SSD and HDD users where data duplication was and was not used. In the worst cases, a 5x difference was reported between instances that used duplication and those that did not. We were being very conservative and doubled that projection again to account for unknown unknowns.
We now know that, contrary to most games, the majority of the loading time in HELLDIVERS 2 is due to level-generation rather than asset loading. This level generation happens in parallel with loading assets from the disk and so is the main determining factor of the loading time. We now know that this is true even for users with mechanical HDDs.
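To make the quoted reasoning concrete, here is a toy model, assuming level generation and asset I/O overlap completely so the load time is simply the slower of the two. All numbers are invented for illustration and are not measurements from the game; `load_time` is a hypothetical helper.

```python
def load_time(level_gen_s: float, asset_io_s: float) -> float:
    """Wall-clock load time when generation and asset I/O fully overlap."""
    return max(level_gen_s, asset_io_s)


# Hypothetical figures, chosen only to illustrate the argument.
LEVEL_GEN_S = 30.0      # assumed time spent generating the level
IO_DUPLICATED_S = 5.0   # assumed HDD read time with duplicated data

# 5x is the worst case cited from industry data, 10x the doubled projection.
for slowdown in (1, 5, 10):
    io_s = IO_DUPLICATED_S * slowdown
    total = load_time(LEVEL_GEN_S, io_s)
    print(f"{slowdown:>2}x slower I/O -> {total:.0f} s load")
```

With these made-up numbers the 5x slowdown is hidden entirely behind level generation, and only the 10x case shows up in the load time. How much headroom there really is depends on the actual ratio of generation time to read time.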
There is no complex technical "how": they intentionally duplicated the game data many times over (something that has been common in game dev for many years, see e.g. this section on asset duplication in a PS5 talk: https://www.youtube.com/watch?v=ph8LyNIT9sg&t=629s), they hadn't done any actual benchmarking, and now they have removed the duplicated data since they realized it was an enormous waste.
It does remind me of the JSON decoder in GTA :D
The game industry does something on its own without fully checking if it's viable at all.
I also had my toy JSON decoder and felt so validated using either charconv in C++ or pre-parsing with a max length and swapping the char after the number to 0 just for the ato{i/f/...} calls.
And I had to use atof, because the float/double decoding of from_chars didn't work for GNU and Clang for a very long time (at least until 2022, even though it was C++17).
Edit:
Watching the talk. I feel like that issue is solvable by checking which models are used on the neighbours.
If there are models of a mailbox on the neighbours, don't put them in the data pack of the city block.
Or alternatively, do octree/quadtree groupings.
I wouldn't expect the load times to be impacted massively by splitting the duplicated data into 2, 3, or 4 groups instead of just one chunk.
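A minimal sketch of the neighbour idea, assuming city blocks sit on a 2D grid and each block's pack can be read together with the packs of its four direct neighbours. `pack_blocks` and the sample data are hypothetical, not anything from the talk. Blocks are processed in a fixed order and a model is skipped only if an already-written neighbouring pack physically contains it, so every model stays reachable.

```python
from typing import Dict, Set, Tuple

Pos = Tuple[int, int]


def pack_blocks(models_by_block: Dict[Pos, Set[str]]) -> Dict[Pos, Set[str]]:
    """Build one pack per block, skipping models stored in a neighbour's pack."""
    packs: Dict[Pos, Set[str]] = {}
    for x, y in sorted(models_by_block):
        covered: Set[str] = set()
        # Only packs that were already written can cover a model.
        for n in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            covered |= packs.get(n, set())
        packs[(x, y)] = models_by_block[(x, y)] - covered
    return packs


blocks = {
    (0, 0): {"mailbox", "lamp"},
    (1, 0): {"mailbox", "bench"},
    (2, 0): {"mailbox"},
}
packs = pack_blocks(blocks)
naive_count = sum(len(m) for m in blocks.values())
packed_count = sum(len(p) for p in packs.values())
print(naive_count, packed_count)
```

In this sample the middle block drops its mailbox because the block to its left already stores one, while the rightmost block keeps its own copy since neither neighbouring pack has it. The trade-off is that loading a block now means reading up to five packs instead of one.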
Installation Size
The installation size of HELLDIVERS 2 on PC seems to be a hot topic right now so let’s start with that. The current install size on PCs is around 150 GB. This is roughly three times larger than the same game installed on consoles! Given the amount of content in the game, the size on consoles seems quite reasonable so the obvious question is - why is it so large on PC?
Data Duplication
Much of the data in the PC version of HELLDIVERS 2 is duplicated. The practice of duplicating data to reduce loading times is a game development technique that is primarily used to optimize games for older storage media, particularly mechanical Hard Disk Drives (HDDs) and optical discs like DVDs.
This practice is largely unnecessary for games deployed on Solid State Drives (SSDs) which is why the console versions of HELLDIVERS 2 do not do this.
Running commentary from the community is that the slim build doesn't have the duplicated files.
Which means that if the Windows editions weren't so split up with features for $$$, we could enable Data Deduplication and spot/avoid the original issue earlier: https://learn.microsoft.com/en-us/windows-server/storage/data-deduplication/understand But... Server only.
Looks like there's ZFS for Windows, so maybe it's possible to sidestep MS' segmentation here?
https://github.com/openzfsonwindows/openzfs
I wouldn't put anything valuable on it, but a game that can be easily re-installed seems like it could be worth it.
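For a rough idea of what block-level deduplication could save on data like this, one can hash fixed-size chunks and count the unique ones. This is only a sketch: `unique_chunk_ratio` is a hypothetical helper, and the fixed 4 KiB chunk size is an assumption for illustration, not how Windows Data Deduplication or ZFS actually chunk data.

```python
import hashlib


def unique_chunk_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Fraction of fixed-size chunks that are unique (1.0 = no duplicates)."""
    seen = set()
    total = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        seen.add(hashlib.sha256(chunk).digest())
        total += 1
    return len(seen) / total if total else 1.0


# Synthetic example: the same "asset" stored three times, plus one other asset.
asset_a = b"A" * 4096
asset_b = b"B" * 4096
sample = asset_a * 3 + asset_b
ratio = unique_chunk_ratio(sample)
print(f"{ratio:.2f} of the chunks are unique")
```

On real game archives the result would depend heavily on alignment and compression, since duplicated assets that are compressed or packed at different offsets would not produce identical fixed-size chunks.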