Why Zip Compression in Panther?
tspencer83
08-21-2003, 01:30 PM
Hi, just wondered if anyone had any idea as to why Apple would choose to incorporate .zip compression instead of .sit compression into Panther. I was always under the impression that .sit and now .sitx compression were both far better than .zip compression. Just curious. Thanks
JLL is correct. *.sit and *.sitx are proprietary, closed formats owned by Aladdin Systems.
Also, I've heard terrible things regarding speed and the sitx format.
Moving to Mac OS X.
thuh Freak
08-21-2003, 03:11 PM
incorporate how? don't they give away stuffit expander anymore? how come they don't use tarballs (.tgz, .tar.gz)?
jwill
08-21-2003, 07:09 PM
I have no idea..I don't mind the .zip, though, and since it's built into the system, you wonder why someone wouldn't use it.
Even though I think .sit files are smaller than .zip..I know that .sit is property of Aladdin, so they can't use that.
LoCash
08-21-2003, 10:10 PM
.zip is a bit more friendly to most consumers, particularly people familiar with Windows. If you offer .zip support at the system level in Mac OS X, it helps make people feel more comfortable.
That's my theory...
Barto
08-21-2003, 10:19 PM
If you want to share files with Windows users, you ZIP it. It is a universal archiving format.
For more convenient archiving, you turn a folder into a disc image. Unfortunately, as yet, DMGs only work in Mac OS X. So to send files amongst Mac users, you disc image something.
In-ter-op-er-a-bil-i-ty
Barto
jwill
08-21-2003, 11:02 PM
Originally posted by Barto
If you want to share files with Windows users, you ZIP it. It is a universal archiving format.
For more convenient archiving, you turn a folder into a disc image. Unfortunately, as yet, DMGs only work in Mac OS X. So to send files amongst Mac users, you disc image something.
In-ter-op-er-a-bil-i-ty
Barto
Makes a lot of sense to me.:smokey:
Aquatic
08-21-2003, 11:05 PM
Because Aladdin is going out of business. I used to love them. But their product just isn't keeping up. Expander isn't multithreaded!? It can't decompress two or more files at the same time, so if you want to expand a 1k file while a huge SITX is unstuffing, too bad. It cannot uncompress damaged or incomplete files. I have to keep a copy of MindExpander around and fire it up in Classic for that. If freeware can do it, why not Expander 7? They are becoming content with mediocre software, so I am guessing in a few years they won't be around. Sure, ZIP is less efficient, but it's on Windows. Tarballs aren't. PC users know what ZIP is; they haven't heard of UNIX. So this is good, I think.
Amorph
08-21-2003, 11:44 PM
Originally posted by Aquatic
Because Aladdin is going out of business.
Not surprisingly, given that Stuffit has gotten more and more obnoxious and less and less useful, and Aladdin have been requiring email addresses and personal information, piling on commercial mail and generally being annoying. It used to be a great company...
ZIP isn't bad, nobody really owns it (well, except for the old Shrink method, which uses the same LZW compression algorithm as GIF), and every platform understands it.
(Anybody remember Compact Pro?)
torifile
08-22-2003, 12:02 AM
winzip uncompresses .tgz just fine. I use it just to throw people off. Of course, they double click it without knowing what the hell the extension means. :no: I swear, the people I work with are an IT person's nightmare.
iBrowse
08-22-2003, 12:21 AM
As far as I know, WinZip can open .sit files just fine.
Spart
08-22-2003, 12:28 AM
Which means nothing because Apple cannot use Aladdin's formats regardless of how well they can be used on the Windows side of things?
I think incorporating .zip is a great idea, though I would love to see them do .tgz and other formats. At least to the point where you can expand them using the Finder, if not create them.
Barto
08-22-2003, 12:33 AM
Originally posted by Spart
Which means nothing because Apple cannot use Aladdin's formats regardless of how well they can be used on the Windows side of things?
I think incorporating .zip is a great idea, though I would love to see them do .tgz and other formats. At least to the point where you can expand them using the Finder, if not create them.
Ditto. It shouldn't be hard to do either (at all). In fact, all that is needed is a shell call to tar and gzip.
Barto
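To make the "shell call to tar and gzip" idea concrete, here is a minimal sketch in Python. The function names `tar_extract_command` and `expand_tgz` are hypothetical, and nothing here reflects how the Finder is actually implemented; it only assumes a `tar` binary on the PATH that accepts the usual `-x`, `-z`, `-f`, and `-C` options.

```python
import subprocess
from pathlib import Path


def tar_extract_command(archive: str, dest: str) -> list[str]:
    """Build the argv for extracting a gzipped tarball into dest."""
    # -x extract, -z filter through gzip, -f read from the named file,
    # -C change into the destination directory before extracting.
    return ["tar", "-xzf", archive, "-C", dest]


def expand_tgz(archive: str, dest: str) -> None:
    """Hypothetical 'expand' handler: create dest, then shell out to tar."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if tar exits with an error.
    subprocess.run(tar_extract_command(archive, dest), check=True)
```

Passing the arguments as a list rather than one shell string avoids quoting problems with file names that contain spaces, which are common on the Mac.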
dfiler
08-22-2003, 07:30 AM
The quality of Stuffit has been declining for a few years now. It seems that the software has actually gotten worse, becoming more complicated without adding useful features. (like multi-threading)
I suspect that the person originally responsible for Stuffit's quality/reputation is now gone. It's not too uncommon for a single person or perhaps a handful of people to make or break an application. Anyone got details on this? Has staffing changed at Aladdin? If so, perhaps Apple just recognized that a niche format isn't a good choice when it appears to be heading downhill.
kim kap sol
08-22-2003, 08:58 AM
Originally posted by dfiler
The quality of Stuffit has been declining for a few years now. It seems that the software has actually gotten worse, becoming more complicated without adding useful features. (like multi-threading)
I suspect that the person originally responsible for Stuffit's quality/reputation is now gone. It's not too uncommon for a single person or perhaps a handful of people to make or break an application. Anyone got details on this? Has staffing changed at Aladdin? If so, perhaps Apple just recognized that a niche format isn't a good choice when it appears to be heading downhill.
Little Raymond Lau was the author of StuffIt back in the days when StuffIt was excellent. Looking at Aladdin's current software quality, I'm almost certain Raymond Lau is gone.
Lau stopped coding StuffIt in January 1995 (http://www.raylau.com/StuffIt.html)...it was about that time when StuffIt started going downhill...thus confirming my first paragraph.
Zadak
08-28-2003, 07:39 AM
Originally posted by Amorph
(Anybody remember Compact Pro?)
Ah, the memories. LC II and Compact Pro were a wonderful combo. Sort of.
der Kopf
08-28-2003, 07:50 AM
I'd rather see gzipped and bzipped tarballs as Panther's default too, though for no rational reason. I don't doubt that eventually, Apple will allow us to choose what the standard method of archiving will be. I do hope Panther will solve the issue where contextual menus in the Finder take up to 30 seconds to show. To be honest, the Finder is most probably the Jaguar application I dread using most. It is by far the slowest and most annoying. Going by all the reports, and my own tries, I cannot wait for Panther.
Anonymous Karma
08-28-2003, 07:53 AM
Zip, at least in theory, can handle resource forks. I have not yet tested to see if this is so in Panther or if they just AppleDouble it.
Originally posted by Anonymous Karma
Zip, at least in theory, can handle resource forks. I have not yet tested to see if this is so in Panther or if they just AppleDouble it.
I don't know what they use but resource forks are preserved.
andrewm
08-28-2003, 05:38 PM
StuffIt formats aside, when one considers that the BSD subsystem of OS X includes programs for zip (.zip), gzip (.gz), bzip2 (.bz2 and .bz), tar (.tar), and the disused compress (.Z), I think that Apple should provide Finder support for, at the very least, decompression of these formats, without the need to open either Terminal or a third-party program--well-coded or not--such as StuffIt Expander, OpenUp, et alii.
The Finder preferences in Panther have an Advanced pane. In this pane could be added an option for the default format. Apple might also implement a contextual menu solution to allow compression, on-the-fly, into any format selected from this menu.
apeiros
08-29-2003, 03:03 AM
Originally posted by andrewm
StuffIt formats aside, when one considers that the BSD subsystem of OS X includes programs for zip (.zip), gzip (.gz), bzip2 (.bz2 and .bz), tar (.tar), and the disused compress (.Z), I think that Apple should provide Finder support for, at the very least, decompression of these formats, without the need to open either Terminal or a third-party program
They should go even further and implement live browsing of those archives in a normal Finder window (possibly with an additional column like "compression" which indicates how heavily each file was compressed, as a percentage). Given that most Linux distributions already have this feature, it is a shame that Apple hasn't come up with this yet...
Eugene
08-29-2003, 05:22 AM
Originally posted by andrewm
StuffIt formats aside, when one considers that the BSD subsystem of OS X includes programs for zip (.zip), gzip (.gz), bzip2 (.bz2 and .bz), tar (.tar) ... I think that Apple should provide Finder support for, at the very least, decompression of these formats...
Nitpicking, but a .tar file by itself does not utilize compression, right? ;)
Barto
08-29-2003, 07:50 AM
It may not utilize compression, but what it achieves is compression. Explanation below.
Simple archiving of files (eg tarring) DOES compress them. Due to the way disks work, instead of allocating a set of bits or bytes to a file, the file system allocates "blocks" (but I'm sure you know that).
On my hard drive, they are 4KB each. You will therefore have roughly 2KB of wasted space per file. Now times that by 100,000 files (average file system), and you get 200MB of wasted space.
This particular instance of wasted space is eliminated when you archive with tar. What you achieve is essentially compression.
I think it is quite reasonable to call the extraction of a tar archive "decompression" from the user's perspective, if not the developer's perspective.
***NITPICKER 2000 BOT DEACTIVATED***
Barto
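The 200MB figure above can be checked with a few lines of arithmetic. This is only a sketch: the helper names are made up, and it assumes, as the post does, 4KB allocation blocks and an average of half a block wasted per file.

```python
def slack_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes allocated but unused in the last block of one file."""
    # A file that ends exactly on a block boundary wastes nothing.
    return (-file_size) % block_size


def estimated_slack(num_files: int, block_size: int = 4096) -> int:
    """Total slack, assuming half a block is wasted per file on average."""
    return num_files * (block_size // 2)


if __name__ == "__main__":
    total = estimated_slack(100_000)
    # Roughly 200MB for 100,000 files on 4KB blocks.
    print(f"{total:,} bytes of estimated slack")
```

The half-a-block average is itself an assumption; a volume full of tiny files wastes more per file, and a volume of large files wastes proportionally very little.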
dfiler
08-29-2003, 08:05 AM
This is probably why tar is used for light-weight compression on *nix systems. Tar works well for numerous files that are smaller than the (minimum) block-size.
Live browsing of archives, while convenient, has interesting implications with regards to user interfaces. There are issues with user awareness of the distinctions between files and archives. Many or perhaps even most users are incapable of grasping the difference let alone capable of discerning which they are working with.
With that said, I hope Apple implements an extensible compression API with an easily accessible front end via the Finder. They could also set up a server for distributing certified modules such that users would be prompted to download the correct, missing module when trying to decompress a file. (à la QuickTime)
giant
08-29-2003, 05:29 PM
Originally posted by iBrowse
As far as I know, WinZip can open .sit files just fine.
I just tried and can't get it to do it
1337_5L4Xx0R
08-31-2003, 02:01 AM
Originally posted by dfiler
Live browsing of archives, while convenient, has interesting implications with regards to user interfaces. There are issues with user awareness of the distinctions between files and archives. Many or perhaps even most users are incapable of grasping the difference let alone capable of discerning which they are working with.
If done correctly, the user need never know that the file is in a compressed archive. Besides, users understand disks, what's so different about archives?
Barto
08-31-2003, 06:23 AM
This is why disc images are so good. Because users understand disks. Any user can figure out, "Ok, I've got files on this disk, so I'll drag them to the disk I want" (or double click the installer).
Now, archives are more complex. They are a folder, which has morphed into a single file. You need to extract the folder from that file before it is a folder again. Now you've extracted it, you have a folder and the original file.
This is simple to you or me. However, Joe C. Illiterate will go straight into *DUH* mode when you tell him an archive is a folder compressed into a single file that you need to expand for your files to appear.
-------------
The other solution is to have archives behave like folders (ala XP). Except users get pissed off when the performance of their computer halves due to decompression overhead.
Barto
dfiler
08-31-2003, 10:05 AM
Originally posted by 1337_5L4Xx0R
If done correctly, the user need never know that the file is in a compressed archive. Besides, users understand disks, what's so different about archives?
I'm definitely nit-picking here but bear with me. The distinction isn't a huge problem but Apple has a history of limiting their interfaces to account for rare but confusing/dangerous scenarios.
If a user is unaware that they are working with an archive and then tries to share a portion of the archive contents with another user, problems can arise. File sizes will start shifting or the files might be emailed elsewhere without the user realizing that the recipient must have special utilities and knowledge to view the files.
If a clueless user has a CD's worth of mp3s in a compressed archive and then tries to burn them for a PC user... what happens? Either not all of the files will fit, or the PC user must know how to extract the files. This is a simple scenario for most of us to understand. Unfortunately, we can't expect the vast majority of users to fully understand what is going on behind the scenes and to then conceive a workaround.
All but a couple of the 40-60 year old office workers at my workplace would be completely lost. Similarly, the education majors and young teachers I know would be confused. "What? This is a special folder which must be treated differently?" The designer in the office next to me would certainly run into problems exchanging Quark files (and their ingredients) with sister companies and publishers. Frequently, it is necessary to re-explain the options for embedding graphics and fonts. There are all kinds of implications for what types of applications the recipient of such a file must have. Also, it causes the file sizes to change in a manner contrary to the designer's intuition.
Hiding the distinction between folders and archives will confuse users. Users must treat the folders and archives differently or they will eventually run into task-ending failures in their workflow.
Barto
08-31-2003, 05:41 PM
I'll nit-pick your nit-pick then.
Whenever there is a problem with Quark, it's Quark's problem, not Apple's. No exceptions. ;)
Barto
dobby
09-02-2003, 04:02 PM
Very few platforms don't support the .zip compression standard. Aladdin's format is proprietary and can't be used from the command line without buying it!
What would be really useful is a command line way of keeping a file's resource fork when transferring the file over the internet.
Dobby.
klinux
09-02-2003, 07:14 PM
Originally posted by Barto
It may not utilize compression, but what it achieves is compression. Explanation below.
Simple archiving of files (eg tarring) DOES compress them. Due to the way disks work, instead of allocating a set of bits or bytes to a file, the file system allocates "blocks" (but I'm sure you know that).
On my hard drive, they are 4KB each. You will therefore have roughly 2KB of wasted space per file. Now times that by 100,000 files (average file system), and you get 200MB of wasted space.
This particular instance of wasted space is eliminated when you archive with tar. What you achieve is essentially compression.
I think it is quite reasonable to call the extraction of a tar archive "decompression" from the user's perspective, if not the developer's perspective.
***NITPICKER 2000 BOT DEACTIVATED***
Barto
Nitpick? :) You are still not correct, technically or not. TAR, while often used with compression software, has nothing to do with compression.
Sure, you are possibly reducing size-on-disk, which is variable from machine to machine, but not the actual size of the files. The reduction of size-on-disk is really cluster size dependent, not as a function of TAR.
Heck, does moving your 100k files to a drive with a smaller cluster size count as compression? Or does moving those files to a large-cluster HD count as decompression?
Kickaha
09-02-2003, 07:35 PM
I think the point was that it saves space by removing wasted empty bits. ;)
Po-tay-to, po-tah-to
Barto
09-02-2003, 10:31 PM
I didn't say TAR is compression. I said this:
I think it is quite reasonable to call the extraction of a tar archive "decompression" from the user's perspective, if not the developer's perspective.
From a user's perspective, turning something into an archive and saving 200MB of space is compression, algorithmic or not.
Barto
John Whitney
09-03-2003, 11:52 AM
Originally posted by Barto
I didn't say TAR is compression. I said this:
I think it is quite reasonable to call the extraction of a tar archive "decompression" from the user's perspective, if not the developer's perspective.
From a user's perspective, turning something into an archive and saving 200MB of space is compression, algorithmic or not.
Barto
<ultra nitpicking>
Hrmph. With a standard block size of 1k, that means a max of 1023 wasted bytes per file. This means it'd take a tar archive of 205,000 files to save 200MB, and that's the low end. :-)
Given an average wastage of 512 bytes/file, it'd take a tar archive with 2048 files to even recover 1MB of disk space. I doubt the user would even notice any space savings of under 1MB. 2048 is still a rather large archive.
Personally, I wouldn't call extraction from a tar file "decompression" at all, I'd call it "unpacking". :-)
</ultra nitpicking>
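For anyone who wants to redo the numbers in this nitpick, a rough sketch follows. The function name `files_needed` is invented for illustration, and "MB" is taken to mean 2^20 bytes.

```python
def files_needed(target_bytes: int, wasted_per_file: int) -> int:
    """Smallest number of files whose combined slack reaches target_bytes."""
    # Ceiling division without floating point.
    return -(-target_bytes // wasted_per_file)


MB = 2**20

# Worst case on 1K blocks: every file wastes 1023 bytes.
worst_case = files_needed(200 * MB, 1023)

# Average case on 1K blocks: every file wastes 512 bytes.
average_case = files_needed(1 * MB, 512)

if __name__ == "__main__":
    print(worst_case, average_case)
```

Swapping in 2048 wasted bytes per file (the 4KB-block average used earlier in the thread) brings the 200MB case down to about 100,000 files.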
thuh Freak
09-03-2003, 04:14 PM
Originally posted by John Whitney
Personally, I wouldn't call extraction from a tar file "decompression" at all, I'd call it "unpacking". :-)
i usually call it untar'ing.
Barto
09-03-2003, 09:57 PM
Originally posted by John Whitney
<ultra nitpicking>
Hrmph. With a standard block size of 1k, that means a max of 1023 wasted bytes per file. This means it'd take a tar archive of 205,000 files to save 200MB, and that's the low end. :-)
Given an average wastage of 512 bytes/file, it'd take a tar archive with 2048 files to even recover 1MB of disk space. I doubt the user would even notice any space savings of under 1MB. 2048 is still a rather large archive.
Personally, I wouldn't call extraction from a tar file "decompression" at all, I'd call it "unpacking". :-)
</ultra nitpicking>
Last time I checked, 90% of hard drives have 4KB block sizes. It would only take about 100,000 files. The example I used was tarring an average hard drive, presumably to back up on tape.
Now, I wouldn't call it decompression. But what I did say (and keep saying) is from an average user's perspective, in fact any user's perspective, untarring achieves the same as decompression. Untarring is effectively decompression at the user side.
Barto
Kickaha
09-04-2003, 12:05 AM
By that logic, so was moving a file from a 2GB drive to an 80GB drive under HFS... the block size grew, so the 'file size' did as well... :P
Can we all just agree that this semantic equine has been well and truly sadonecrobestialized and move on, or let this thread die?
Barto
09-04-2003, 12:20 AM
You would need to be moving a lot of files, and the process is not the same as associated with a compressed file like ZIP or SIT (double-clicking an archive to create a folder). :p
Barto
Kickaha
09-04-2003, 12:32 AM
Sadonecrobestiality is all that's happening here.
Someone else want to jump in with a reason why I *shouldn't* close this thread?
Amorph
09-04-2003, 12:34 AM
Originally posted by Kickaha
Someone else want to jump in with a reason why I *shouldn't* close this thread?
Because I want to add that this appears to be a user interface problem, and it would appear to me that displaying a compressed archive as a disk image with a tight belt, or a zipper, or some other indication that it wasn't just a disk image, would solve the "is it many or is it one" question elegantly enough.
OK, now you can close the thread. Before anyone has time to shoot down my idea. ;)
John Whitney
09-04-2003, 05:05 AM
Originally posted by Barto
Last time I checked, 90% of hard drives have 4KB block sizes. It would only take about 100,000 files. The example I used was tarring an average hard drive, presumably to back up on tape.
Now, I wouldn't call it decompression. But what I did say (and keep saying) is from an average user's perspective, in fact any user's perspective, untarring achieves the same as decompression. Untarring is effectively decompression at the user side.
Barto
Whoops! You're right, learn something new every day. :) Apparently, HFS+ uses a default 4K block, and HFS uses a block size based on disk size.
As for closing the thread, why? People are being quite civil. Are boring technical interactions not allowed on AI?
John
Kickaha
09-04-2003, 07:34 AM
Not when the moderator had been up for over 40 hours straight and demanded entertainment from his fiefdom of threads, no. :D
Carry on, carry on... (carrion? Aw crap, now we're back to the sadonecrobestiality....)
Why not bz2 or gz?
Those formats don't support random access: you provide one file (a .tar archive) and they give you a compressed version of that file. They don't know anything about its content. However, because they compress the whole archive as a single stream, they usually compress better than zip.
Zip, on the other hand, doesn't provide very good compression ratios, but you can delete files from archives, add new files, extract just a few files, etc.
If Apple intends to provide partial on-the-fly extraction in the future (display the archive as a disk image/folder, whatever), then they simply don't have much choice among the available formats.
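Here is a small sketch of the random-access point, using Python's standard `zipfile` module as a stand-in (it says nothing about what Panther itself uses). A zip archive keeps a directory of its members, so one file can be listed, read, or appended without touching the rest. The member names and contents are made up.

```python
import io
import zipfile

# Build a small archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", "alpha")
    zf.writestr("b.txt", "beta")

# Append a new member later without rewriting the existing ones.
with zipfile.ZipFile(buf, "a", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("c.txt", "gamma")

# List the directory and pull out a single member by name.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    only_b = zf.read("b.txt")

if __name__ == "__main__":
    print(names, only_b)
```

With a .tar.gz or .tar.bz2, getting at one file means decompressing the stream from the start until that file turns up, which is why those formats are a poor fit for browsing an archive like a folder.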
Eugene
09-04-2003, 03:51 PM
Originally posted by John Whitney
Whoops! You're right, learn something new every day. :) Apparently, HFS+ uses a default 4K block, and HFS uses a block size based on disk size.
As for closing the thread, why? People are being quite civil. Are boring technical interactions not allowed on AI?
John
Well, I'd venture to guess 90% of the world uses Windows, and roughly half of them in turn use XP/2000/NT and NTFS, which has an utterly complex variable blocksize, which at least allows for down to 1K blocksizes.
As well, I'm not sure that HDDs have 4K blocksizes. Most modern HDDs still have 512 Byte sector/block sizes.
John Whitney
09-04-2003, 09:50 PM
Originally posted by Eugene
Well, I'd venture to guess 90% of the world uses Windows, and roughly half of them in turn use XP/2000/NT and NTFS, which has an utterly complex variable blocksize, which at least allows for down to 1K blocksizes.
As well, I'm not sure that HDDs have 4K blocksizes. Most modern HDDs still have 512 Byte sector/block sizes.
As far as I know, block size is not related to the HDD in most cases. I found an Apple document stating that HFS+ has a default 4K block size, while HFS was variable (apparently, HFS has a max number of blocks, around 64k of them). I'm more used to a 1K block size on Linux filesystems, so the 4K surprised me.
It's a tradeoff. Less disk space loss in filesystem overhead, more per file.
Originally posted by John Whitney
As far as I know, block size is not related to the HDD in most cases. I found an Apple document stating that HFS+ has a default 4K block size, while HFS was variable (apparently, HFS has a max number of blocks, around 64k of them).
Not quite.
Block size is always related to hard drive size. The size of the blocks grows directly with the size of the drive. The difference today is that there are many more blocks.
HFS supports 16 bits for block allocation and a total of 65,536 blocks.
HFS+ supports 32 bits for block allocation and a total of 4,294,967,296 blocks.
John Whitney
09-05-2003, 05:08 AM
Originally posted by Brad
The size of the blocks grows directly with the size of the drive.
That doesn't sound quite right either. How about, "there is a minimum size blocks have to be based on number of blocks and size of the drive, but often the size will be greater than that minimum"?
The HFS+ default 4K will be used on all hard drives until the size of the drive exceeds 16TB. After 16TB, the block size would grow directly with HDD size. The same was probably true for HFS on drives up to 256MB.
Sound correct?
John
Barto
09-05-2003, 06:14 AM
Sounds very correct to me. I would check the 16TB figure, but I cannot be bothered.
The transition between HFS and HFS+ was very painful, because hard drives increased from ~200MB to several gigabytes before HFS+ came into operation. Block sizes of 32k and 64k led to huge bloat in volumes.
Barto
Well, I was right about the numbers I gave. The absolute *minimum* block size, though, is 512 bytes as Eugene said, not 4 KB. A quote from Apple TN #1150 (http://developer.apple.com/technotes/tn/tn1150.html):
HFS divides the total space on a volume into equal-sized pieces called allocation blocks. It uses 16-bit fields to identify a particular allocation block, so there must be less than 2^16 (65,536) allocation blocks on an HFS volume. The size of an allocation block is typically the smallest multiple of 512 such that there are less than 65,536 allocation blocks on the volume (i.e., the volume size divided by 65,535, rounded up to a multiple of 512). Any non-empty fork must occupy an integral number of allocation blocks. This means that the amount of space occupied by a fork is rounded up to a multiple of the allocation block size. As volumes (and therefore allocation blocks) get bigger, the amount of allocated but unused space increases.
HFS Plus uses 32-bit values to identify allocation blocks. This allows up to 2^32 (4,294,967,296) allocation blocks on a volume. More allocation blocks means a smaller allocation block size, especially on volumes of 1 GB or larger, which in turn means less average wasted space (the fraction of an allocation block at the end of a fork, where the entire allocation block is not actually used). It also means you can have more files, since the available space can be more finely distributed among a larger number of files. This change is especially beneficial if the volume contains a large number of small files.
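The rule quoted from the technote (volume size divided by 65,535, rounded up to a multiple of 512) is easy to turn into a quick calculation. This is a sketch of that one sentence only, with an invented function name; it is not taken from any Apple code, and it describes the "typical" size the technote mentions rather than anything a real formatter is guaranteed to choose.

```python
def hfs_allocation_block_size(volume_bytes: int) -> int:
    """Typical HFS allocation block size for a volume, per the quoted rule."""
    # Volume size divided by 65,535, rounded up to a whole byte count.
    per_block = -(-volume_bytes // 65_535)
    # Round up to the next multiple of 512, with 512 as the floor.
    return max(512, -(-per_block // 512) * 512)


MB = 2**20
GB = 2**30

if __name__ == "__main__":
    for size in (200 * MB, 2 * GB):
        print(size, hfs_allocation_block_size(size))
```

For a 2GB volume this gives 33,280 bytes (32.5K) per allocation block, which lines up with the 32k block sizes mentioned earlier in the thread.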
John Whitney
09-05-2003, 09:30 AM
Originally posted by Brad
Well, I was right about the numbers I gave. The absolute *minimum* block size, though, is 512 bytes as Eugene said, not 4 KB. A quote from Apple TN #1150 (http://developer.apple.com/technotes/tn/tn1150.html):
True, but I was talking about default sizes, not minimum, which as per this (http://docs.info.apple.com/article.html?artnum=25557) Apple article, is 4KB.
The actual number of files that can be stored on an Mac OS Extended (HFS Plus) volume depends on the size of the volume and the size of the files. For example, a 160 GB HFS Plus volume with the default block size of 4K will have 40 million available blocks. This volume could store up to 40 million very small files. A bigger disk with the same default block size could hold proportionately more files.
This just means that unless you explicitly modify the block size when creating the filesystem, you'll get 4KB blocks on anything smaller than a 16TB disk. Even using the minimum 512 byte blocks will work until drive sizes exceed 2TB. In either case, this is probably a fair distance off in the future (although you never know, these days... drive sizes are getting larger quickly).
I think we've hashed it out now: drive size can play a role in block sizes in HFS+, but not with any currently available drive. :)
John
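The 16TB and 2TB thresholds fall straight out of the 32-bit block count. A quick sketch, with made-up helper names and binary units (TB = 2^40 bytes, GB = 2^30 bytes) assumed throughout:

```python
MAX_HFS_PLUS_BLOCKS = 2**32  # 32-bit allocation block numbers

GB = 2**30
TB = 2**40


def max_volume_bytes(block_size: int) -> int:
    """Largest volume addressable with the given allocation block size."""
    return MAX_HFS_PLUS_BLOCKS * block_size


def blocks_on_volume(volume_bytes: int, block_size: int = 4096) -> int:
    """Number of whole allocation blocks that fit on a volume."""
    return volume_bytes // block_size


if __name__ == "__main__":
    print(max_volume_bytes(4096) // TB, "TB with 4K blocks")
    print(max_volume_bytes(512) // TB, "TB with 512-byte blocks")
    print(blocks_on_volume(160 * GB), "blocks on a 160GB volume")
```

The 160GB case comes out at roughly 42 million blocks in binary units, or about 39 million if the drive is measured in decimal gigabytes, so Apple's "40 million" is a round figure either way.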