Tiger - very slow indexing with Spotlight?

Posted:
in macOS edited January 2014
I installed the Tiger Developer Preview on both my Power Mac G5 (dual 2 GHz) and my older iBook (G3/600, Radeon graphics). I erased the iBook's hard drive before installing Tiger, but on the G5 I chose the "Archive and Install" option. The iBook, with its nearly empty disk, finished indexing within a few minutes of the first boot; the G5 is still indexing after 13 hours of uptime (150 GB used of 160 GB).

Two processes are consuming CPU cycles: MDS (could be Meta Data Search) and MDImport (Meta Data Import).

Is this normal? Has anyone else experienced it?

Comments

  • Reply 1 of 27
    andersanders Posts: 6,523member
    Quote:

    Originally posted by Mendel

    I installed the Tiger Developer Preview on both my Power Mac G5 (dual 2 GHz) and my older iBook (G3/600, Radeon graphics). I erased the iBook's hard drive before installing Tiger, but on the G5 I chose the "Archive and Install" option. The iBook, with its nearly empty disk, finished indexing within a few minutes of the first boot; the G5 is still indexing after 13 hours of uptime (150 GB used of 160 GB).

    Two processes are consuming CPU cycles: MDS (could be Meta Data Search) and MDImport (Meta Data Import).

    Is this normal? Has anyone else experienced it?




    For alpha software, yes. Speed doesn't really have any priority at this phase. As a developer you ought to know that.
  • Reply 2 of 27
    mendelmendel Posts: 5member
    I am not concerned about the overall performance of Tiger (which is actually very good!), only about the speed of the indexing technology!



    Having sampled the "mdimport" and "mds" processes from Activity Monitor, I know there is activity and that they are not hung. They are doing something, but with very little disk activity.

    It would be great if somebody could post any insights!
  • Reply 3 of 27
    a_greera_greer Posts: 4,594member
    beta || developer preview || release candidate == BUGS;



    slow_index_on_upgrade == BUG;



    Also, for OS updates, isn't it standard to wipe the disk clean before installing? In my experience, upgrading an OS in place results in a ton of bit rot, particularly in a development/multimedia environment where the user relies on high-performance apps.
  • Reply 4 of 27
    mendelmendel Posts: 5member
    Let's hope the indexing performance improves in the final release of Tiger. The overall performance is excellent.

    Is anyone else here running Tiger and seeing this problem?
  • Reply 5 of 27
    mendelmendel Posts: 5member
    Some more information:

    I mentioned that my freshly formatted iBook doesn't have the problem with the long, slow indexing. I copied my music (5 GB) to the iBook, and the same processes (mdimport and mds) launched and started indexing. Let's see how long the iBook takes for the MP3s, and whether this is a problem specific to the MP3 metadata importer.
  • Reply 6 of 27
    faust9faust9 Posts: 1,335member
    I noticed something similar with Panther.



    Short story:



    I updated my 14" iBook a few weeks ago, and the update included a video update. That video update wreaked havoc on my system, causing lots of psychedelic colors and eventually the grey "hold button to reboot" failure box. I got a kernel panic ("unable to load module" or something like that) and consequently had to reload the OS. I got everything reloaded and attempted to index, but was unable to. Indexing ran for 24+ hours and wasn't even a third of the way through my 60 GB HD before I said "phooey!" and stopped it. I ended up reloading OS X last week because a few bugs had cropped up since my little crash (among other things, Xcode stopped working for all accounts even though I reinstalled it fresh a couple of times). I indexed my HD last night and it was done by the time I woke up this morning (5 hours later).

    It doesn't sound like it's Tiger-specific, but rather a bug that's been around for some time and that people haven't noticed (not a lot of people need to use indexing anyway).

    [edit] Just wanted to add that I had about the same number and type of files in both index attempts.
  • Reply 7 of 27
    mendelmendel Posts: 5member
    I think I've found the problem!

    The iBook did the same thing as the G5, so I started deleting the MP3s copied over from the G5 one by one. It seems that one MP3 was defective: after I deleted it, the indexing continued normally and finished in minutes.
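Deleting files one by one works, but with a large library a halving search gets there in far fewer tries. The following is only a sketch under two assumptions that are not in the post: exactly one file is defective, and there is some way to check whether a batch of files indexes cleanly. The `batch_ok` callback is a hypothetical stand-in for that check (for example, copying the batch to an empty volume and seeing whether indexing finishes).

```python
from typing import Callable, List, Sequence


def find_bad_file(files: Sequence[str],
                  batch_ok: Callable[[List[str]], bool]) -> str:
    """Return the single file that makes batch_ok fail, halving the list each round."""
    if not files:
        raise ValueError("no files to search")
    candidates = list(files)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        if batch_ok(first_half):
            # First half indexes cleanly, so the culprit is in the second half.
            candidates = candidates[mid:]
        else:
            candidates = first_half
    return candidates[0]


if __name__ == "__main__":
    songs = [f"track{i:02d}.mp3" for i in range(1, 41)]

    # Stand-in check for the demo: pretend track17.mp3 is the defective file.
    def demo_batch_ok(batch: List[str]) -> bool:
        return "track17.mp3" not in batch

    print(find_bad_file(songs, demo_batch_ok))
```

For 40 files this needs about six checks instead of up to 40. If more than one file is bad, the search still ends on a bad file, so it can be repeated after removing each culprit.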
  • Reply 8 of 27
    You must let Apple know. They may ask for that specific file so that they can work around issues like this.
  • Reply 9 of 27
    a_greera_greer Posts: 4,594member
    Quote:

    Originally posted by henryblackman

    You must let Apple know. They may ask for that specific file so that they can work around issues like this.



    And if... "bad karma" was involved in the acquisition of said MP3... plead the fifth...
  • Reply 10 of 27
    synsyn Posts: 329member
    Yeah, I've had a similar experience, with 300GB+ of data.



    The MD importers seem to shit themselves on files without proper extensions, and on some MP3 files too. They just go over the same files again and again; you can witness this by running



    sudo fs_usage



    in the Terminal. It took me a lot of grepping and looking at the logs to determine the faulty files and folders.

    I'm pretty sure this will be fixed by GM; that's a while from now.
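The grepping described above boils down to counting which paths keep reappearing in the captured output. Below is a rough sketch of that idea, not a parser for the real `fs_usage` format: it assumes only that a path shows up as a whitespace-delimited token starting with `/`, and the sample lines in the demo are invented for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumption: a path is a whitespace-delimited token that starts with "/".
# Paths containing spaces would need a smarter parser than this.
PATH_TOKEN = re.compile(r"(?<!\S)/\S+")


def most_touched_paths(lines: Iterable[str], top: int = 5) -> List[Tuple[str, int]]:
    """Count path-like tokens across log lines and return the most frequent ones."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(PATH_TOKEN.findall(line))
    return counts.most_common(top)


if __name__ == "__main__":
    # Invented sample lines, not real fs_usage output.
    sample = [
        "open  /Users/me/Music/broken.mp3  mdimport",
        "read  /Users/me/Music/broken.mp3  mdimport",
        "open  /Users/me/Music/fine.mp3    mdimport",
        "open  /Users/me/Music/broken.mp3  mdimport",
    ]
    for path, hits in most_touched_paths(sample):
        print(hits, path)
```

A file that the importer keeps retrying should float to the top of the list, which narrows down where to look.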
  • Reply 11 of 27
    burkeyburkey Posts: 5member
    Initial indexing will always be slow. Basically every document on the entire system has to be opened, scanned (feature extraction), and have its information (vector and score) added to the master index. This is a laborious initial process, but once it is completed only newly added or modified documents need to be processed, which hardly takes any time at all. Tiger *should* be doing much of the initial indexing during idle time, or at least as a very low priority process.

    Incidentally, much of the search in Tiger is available today: just Get Info on a drive and click the "Index Now" button in the Content Index section. This will index all files and their contents, but not the contents of Mail, Address Book, etc.
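To make the "slow first pass, cheap updates" point concrete, here is a toy sketch. It is not how Spotlight works internally; it assumes a plain word-to-files index and uses each file's modification time to decide whether the file needs to be read again. The class and method names are made up for the example.

```python
import os
import re
from collections import defaultdict
from typing import Dict, Set


class IncrementalIndex:
    """Toy word index that re-reads only files that are new or modified."""

    def __init__(self) -> None:
        self.words: Dict[str, Set[str]] = defaultdict(set)  # word -> paths
        self.seen: Dict[str, int] = {}  # path -> mtime (ns) at last indexing

    def _index_file(self, path: str) -> None:
        # Drop stale entries for this path before adding the current words.
        for paths in self.words.values():
            paths.discard(path)
        with open(path, "r", errors="ignore") as handle:
            for word in re.findall(r"\w+", handle.read().lower()):
                self.words[word].add(path)

    def update(self, root: str) -> int:
        """Walk root, index new or changed files, return how many were processed."""
        processed = 0
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                mtime = os.stat(path).st_mtime_ns
                if self.seen.get(path) != mtime:
                    self._index_file(path)
                    self.seen[path] = mtime
                    processed += 1
        return processed

    def search(self, word: str) -> Set[str]:
        return set(self.words.get(word.lower(), set()))
```

The first `update()` call has to open every file; later calls only stat the files and skip anything unchanged. Deleted files are not handled in this sketch.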



    Chad



    --

    Chad Burkey

    Chad Burkey Photography

    www.chadburkey.com
  • Reply 12 of 27
    placeboplacebo Posts: 5,767member
    Am I correct:



    Spotlight indexes everything initially, and then when files are modified, re-indexes them?
  • Reply 13 of 27
    a_greera_greer Posts: 4,594member
    Quote:

    Originally posted by burkey

    Initial indexing will always be slow. Basically every document on the entire system has to be opened, scanned (feature extraction), and have its information (vector and score) added to the master index. This is a laborious initial process, but once it is completed only newly added or modified documents need to be processed, which hardly takes any time at all. Tiger *should* be doing much of the initial indexing during idle time, or at least as a very low priority process.

    Incidentally, much of the search in Tiger is available today: just Get Info on a drive and click the "Index Now" button in the Content Index section. This will index all files and their contents, but not the contents of Mail, Address Book, etc.



    Chad



    --

    Chad Burkey

    Chad Burkey Photography

    www.chadburkey.com




    That sounds like less of a processor-intensive job, particularly on a G5; the problem is the hard drive. Today's hard drives are pathetic: the technology has changed so little over the past 7-10 years that the drive is more often than not the bottleneck on data access.
  • Reply 14 of 27
    burkeyburkey Posts: 5member
    Quote:

    Originally posted by Placebo

    Am I correct:



    Spotlight indexes everything initially, and then when files are modified, re-indexes them?




    That is my understanding. It probably does reindexing on new or changed documents during idle times (at least that is the way I would design it).



    Chad



    --

    Chad Burkey

    Chad Burkey Photography

    www.chadburkey.com
  • Reply 15 of 27
    burkeyburkey Posts: 5member
    Quote:

    Originally posted by a_greer

    That sounds like less of a processor-intensive job, particularly on a G5; the problem is the hard drive. Today's hard drives are pathetic: the technology has changed so little over the past 7-10 years that the drive is more often than not the bottleneck on data access.



    While I agree with you in principle, it is still a fairly intense process. Indexing 400 GB of documents (most of which are photographs) takes more than a day with my dual G5 and an Xserve RAID. Building an index, essentially an n-dimensional matrix, can be a very complex process. There are things that can make this faster (such as using the GPU for the matrix calculations) or easier (such as using a database to store all the features extracted from the documents and letting the DBMS handle the indexing), but in the end there is a potentially large amount of data that has to be processed and categorized. That is why incremental approaches (indexing and adding new documents to the index as they are created) are almost always preferred to full index recreation when the corpus changes, even though the results of a full recreation are always superior to the incremental results.



    Chad



    --

    Chad Burkey

    Chad Burkey Photography

    www.chadburkey.com
  • Reply 16 of 27
    chuckerchucker Posts: 5,089member
    Quote:

    Originally posted by burkey

    That is my understanding. It probably does reindexing on new or changed documents during idle times (at least that is the way I would design it).



    It tracks file creations, moves, changes, etc. and (re-)indexes the files after a short delay. It does not wait until the machine is idle.
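A minimal sketch of that "re-index after a short delay" behaviour follows. The 2-second default and the `DebouncedQueue` name are assumptions made for illustration, and timestamps are passed in explicitly rather than coming from real file system events.

```python
from typing import Dict, List


class DebouncedQueue:
    """Hold changed paths until each has been quiet for `delay` seconds."""

    def __init__(self, delay: float = 2.0) -> None:
        self.delay = delay
        self.last_change: Dict[str, float] = {}  # path -> time of latest change

    def notify(self, path: str, now: float) -> None:
        # A repeated change to the same path simply restarts its timer.
        self.last_change[path] = now

    def due(self, now: float) -> List[str]:
        """Return (and forget) the paths whose last change is at least `delay` old."""
        ready = sorted(p for p, t in self.last_change.items()
                       if now - t >= self.delay)
        for path in ready:
            del self.last_change[path]
        return ready
```

With this, a file that is saved several times in quick succession gets indexed once, shortly after the last save, without waiting for the whole machine to go idle.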
  • Reply 17 of 27
    placeboplacebo Posts: 5,767member
    Can a Panther Tester tell me: Is there a preference having to do with Spotlight? I.e., whether to index, etc.
  • Reply 18 of 27
    chuckerchucker Posts: 5,089member
    I have so far not found a single thing. The thing just works automatically, whether you want it or not - killing "mdimport" and "mds" will result in them launching again.



    You could remove the startup item, of course.



    There are no "exclude these directories" prefs, no query delay prefs, nothing. However, even on this rather low-end machine (iBook 700), the CPU usage is all right. Dashboard, on the other hand, obviously has some widgets that "leak"...



    Quote:

    Can a Panther Tester tell me



    Probably not, since Panther doesn't include Spotlight :P
  • Reply 19 of 27
    kim kap solkim kap sol Posts: 2,987member
    Here's some excellent Spotlight info from Daring Fireball's one and only John Gruber.
  • Reply 20 of 27
    Quote:

    Originally posted by Mendel

    I installed the Tiger Developer Preview on both my Power Mac G5 (dual 2 GHz) and my older iBook (G3/600, Radeon graphics). I erased the iBook's hard drive before installing Tiger, but on the G5 I chose the "Archive and Install" option. The iBook, with its nearly empty disk, finished indexing within a few minutes of the first boot; the G5 is still indexing after 13 hours of uptime (150 GB used of 160 GB).

    Two processes are consuming CPU cycles: MDS (could be Meta Data Search) and MDImport (Meta Data Import).

    Is this normal? Has anyone else experienced it?




    Apologies in advance if this has already been asked or answered, but I'm too lazy to read the thread.

    Does Tiger use a different file system from Panther's for the Spotlight stuff? I'm guessing no, but I thought I'd ask anyway. I'm thinking that Tiger puts invisible files (much like .DS_Store) in directories to store the extra "meta-data", or something similar. Is this right?



    Thanks. m.