Tiger - very slow indexing with Spotlight?
I installed the Tiger Developer Preview on both my PowerMac G5 Dual 2 GHz and my older iBook (G3/600, Radeon graphics). I erased the iBook's hard drive before installing Tiger, but on the G5 I chose the "Install & Archive" option. The iBook, with its nearly empty hard disk, finished indexing a few minutes after the first boot; the G5 is still indexing its drive after 13 hours of uptime (150 GB used of 160 GB).
Two processes are consuming CPU cycles: MDS (could be Meta Data Search) and MDImport (Meta Data Import).
Does anyone know from experience whether this is normal?
Comments
Originally posted by Mendel
I installed the Tiger Developer Preview on both my PowerMac G5 Dual 2 GHz and my older iBook (G3/600, Radeon graphics). I erased the iBook's hard drive before installing Tiger, but on the G5 I chose the "Install & Archive" option. The iBook, with its nearly empty hard disk, finished indexing a few minutes after the first boot; the G5 is still indexing its drive after 13 hours of uptime (150 GB used of 160 GB).
Two processes are consuming CPU cycles: MDS (could be Meta Data Search) and MDImport (Meta Data Import).
Does anyone know from experience whether this is normal?
For alpha software, yes. Speed doesn't really have any priority at this phase. As a developer you ought to know that.
Having sampled the "mdimport" and "mds" tasks from Activity Monitor, I know there is activity and that these tasks are not hung. They are doing something, but with very little disk activity.
Would be great if somebody can post any insights!
slow_index_on_upgrade == BUG;
Also, for OS updates, isn't it standard to wipe the disk clean before installing? In my experience, upgrading an OS in place results in a ton of bit rot, particularly in a development/multimedia environment where the user relies on high-performance apps.
Is anyone else here running Tiger and having this problem?
As I mentioned, my freshly formatted iBook doesn't have the problem with the long, slow indexing. I copied my music (5 GB) to the iBook, and the same processes (mdimport and mds) launched and started indexing. Let's see how long the iBook takes for the MP3s, and whether this is a problem specific to the MP3 metadata importer.
Short story:
I updated my 14" iBook a few weeks ago, which included a video update. Said video update wreaked havoc on my system, causing many psychedelic colors and eventually the grey "hold button to reboot" failure box. I got a kernel panic ("unable to load module" or something like that) and consequently had to reload the OS. I got everything reloaded and attempted to index, but was unable to do it. Indexing ran for 24+ hours and wasn't even a third of the way through my 60 GB HD before I said "phooey!" and stopped it. I ended up reloading OS X last week because a few bugs had cropped up since my little crash (among other things, Xcode stopped working for all accounts even though I reloaded it fresh a couple of times). I indexed my HD last night and it was done by the time I woke up this morning (5 hours later).
It doesn't sound like it's Tiger-specific, but rather a bug that's been around for some time that people haven't noticed (not a lot of people need to use indexing anyway).
[edit] just wanted to add that I had about the same number and type of files between the two index attempts.
The iBook did the same thing as the G5, so I started deleting the MP3s copied over from the G5 one by one. It seems one MP3 was defective: after I deleted it, the indexing continued normally and finished in minutes.
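For anyone facing a bigger folder than 5 GB of MP3s, deleting files one at a time gets slow. A rough sketch of a halving search follows, assuming exactly one file is at fault. The `indexes_ok` callback is hypothetical: in practice it would mean "copy this subset to a test folder and see whether indexing finishes", which nothing in Tiger automates as far as this thread shows.

```python
def find_bad_file(files, indexes_ok):
    """Return the single file that makes indexing stall.

    Assumes the list is non-empty and contains exactly one bad file.
    indexes_ok(subset) is a caller-supplied (hypothetical) check that
    returns True when indexing the given subset completes normally.
    """
    candidates = list(files)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # If the first half indexes cleanly, the culprit is in the second.
        candidates = second if indexes_ok(first) else first
    return candidates[0]
```

With 1,000 files this needs about ten indexing attempts instead of up to a thousand deletions.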
Originally posted by henryblackman
You must let Apple know - they may ask for that specific file, that way they can work around issues like this.
And if... "bad karma" was involved in the acquisition of said MP3... plead the Fifth...
The MD importers seem to shit themselves on files without proper extensions, and on some MP3 files too. They just go over the same files again and again; you can witness this by issuing a
sudo fs_usage
in the Terminal. It took me a lot of grepping and looking at the logs to determine the faulty files and folders.
I'm pretty sure this will be fixed by GM; that's still a while away.
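The grepping can be partly automated. Below is a minimal sketch that tallies which paths show up most often in captured fs_usage output for one process. It assumes each line carries the path starting at the first "/" and that columns are separated by two or more spaces; that layout is an assumption about the output, not something confirmed in this thread, so adjust the pattern to what your capture actually looks like. The function name `count_paths` is made up for illustration.

```python
import re
from collections import Counter

# Assumed layout: the path starts at the first "/" on the line and ends
# where the next column begins (two or more spaces) or at end of line.
PATH_RE = re.compile(r"(/.*?)(?:\s{2,}|$)")


def count_paths(lines, process="mdimport"):
    """Count how often each path appears on lines mentioning `process`."""
    counts = Counter()
    for line in lines:
        if process not in line:
            continue
        match = PATH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Saving the fs_usage output to a file and passing its lines to `count_paths(...).most_common(10)` should surface any file the importer keeps reopening.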
Incidentally, much of the search in Tiger is available today (just get info on a drive and select the "index now" button in the content index section). This will index all files and their contents, but not the contents of Mail, Address Book, etc.
Chad
--
Chad Burkey
Chad Burkey Photography
www.chadburkey.com
Am I correct:
Spotlight indexes everything initially, and then when files are modified, re-indexes them?
Originally posted by burkey
Initial indexing will always be slow. Basically every document on the entire system has to be opened, scanned (feature extraction), and have its information (vector and score) added to the master index. This is a laborious initial process, but once it is completed only newly added or modified documents need to be processed, which takes hardly any time at all. Tiger *should* be doing much of the initial indexing during idle time, or at least as a very low priority process.
Incidentally, much of the search in Tiger is available today (just get info on a drive and select the "index now" button in the content index section). This will index all files and their contents, but not the contents of Mail, Address Book, etc.
Chad
This sounds like less of a processor-intensive job, particularly on a G5; the problem is the HDD. The HDDs of today are pathetic: HDD technology has changed so little over the past 7-10 years that it is more often than not the bottleneck on data access.
Originally posted by Placebo
Am I correct:
Spotlight indexes everything initially, and then when files are modified, re-indexes them?
That is my understanding. It probably does reindexing on new or changed documents during idle times (at least that is the way I would design it).
Chad
Originally posted by a_greer
This sounds like less of a processor-intensive job, particularly on a G5; the problem is the HDD. The HDDs of today are pathetic: HDD technology has changed so little over the past 7-10 years that it is more often than not the bottleneck on data access.
While I agree with you in principle, it is still a fairly intense process. Indexing 400 GB of documents (most of which are photographs) takes more than a day with my dual G5 and an Xserve RAID. Building an index, essentially an n-dimensional matrix, can be a very complex process. There are things that can make this faster (such as using the GPU for the matrix calculations) or easier (such as using a database to store all the features extracted from the documents and allowing the DBMS to handle the indexing), but in the end there is a potentially large amount of data that has to be processed and categorized. That is why incremental approaches (indexing and adding new documents to the index as they are created) are almost always preferred to full index re-creation when the corpus changes, even though the results of full re-creation are always superior to the incremental results.
Chad
Originally posted by burkey
That is my understanding. It probably does reindexing on new or changed documents during idle times (at least that is the way I would design it).
It tracks file creations, moves, changes, etc. and (re-)indexes the files after a short delay. It does not wait until the machine is idle.
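To make the "short delay" idea concrete, here is a small sketch of delayed re-indexing. It is not how mds actually works internally; the class name, the 2-second delay, and the `indexed` list standing in for real import work are all made up for illustration. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
class DelayedReindexer:
    """Queue changed paths and re-index each one once it has been quiet
    for `delay_seconds`. Illustrative only; not Spotlight's real design."""

    def __init__(self, delay_seconds=2.0):
        self.delay = delay_seconds
        self.pending = {}   # path -> time of its most recent change
        self.indexed = []   # paths re-indexed so far, in order

    def note_change(self, path, now):
        # A newer change to the same path restarts its waiting period.
        self.pending[path] = now

    def tick(self, now):
        """Re-index every path whose last change is at least `delay` old."""
        ready = sorted(p for p, t in self.pending.items()
                       if now - t >= self.delay)
        for path in ready:
            del self.pending[path]
            self.indexed.append(path)  # stand-in for the real import work
        return ready
```

The point of the delay is that a file being saved repeatedly gets indexed once after it settles, rather than once per write, and none of this waits for the machine to go idle.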
You could remove the startup item, of course.
There are no "exclude these directories" prefs, no query delay prefs, nothing. However, even on this rather low-end machine (iBook 700), the CPU usage is alright. Dashboard, on the other hand, obviously has some widgets that "leak"...
Can a Panther Tester tell me
Probably not, since Panther doesn't include Spotlight :P
Originally posted by Mendel
I installed the Tiger Developer Preview on both my PowerMac G5 Dual 2 GHz and my older iBook (G3/600, Radeon graphics). I erased the iBook's hard drive before installing Tiger, but on the G5 I chose the "Install & Archive" option. The iBook, with its nearly empty hard disk, finished indexing a few minutes after the first boot; the G5 is still indexing its drive after 13 hours of uptime (150 GB used of 160 GB).
Two processes are consuming CPU cycles: MDS (could be Meta Data Search) and MDImport (Meta Data Import).
Does anyone know from experience whether this is normal?
Apologies in advance if this has already been asked or answered, but I'm too lazy to read the thread.
Does Tiger use a new file system, different from Panther's, for the Spotlight stuff? I'm guessing no, but I thought I'd ask anyway. I'm thinking that Tiger puts invisible files (much like .DS_Store) in directories to store the extra "metadata", or something similar. Is this right?
Thanks. m.