Find by content in Jaguar...doesn't!

Posted in macOS, edited January 2014
Anyone else have this problem? I try to search for MS Excel files by content, and even when I search on files I KNOW exist, they don't show up. I tried deleting the index and reindexing, but still no luck.



Is 10.2 just bad about that?





Fish

Comments

  • Reply 1 of 9
    kickaha Posts: 8,760 member
    No, Excel is.



    Excel's format is proprietary, and unreadable by anything but Excel, in general.



    If you type 'Quarterly Budget' in Excel, it may appear in your file as Fjkdfs7673&$%#a. Without writing an Excel format parser, index search isn't going to know what the heck to do with this.



    This is why we all love XML files, isn't it boys and girls?



    I just took a look at a very simple Excel sheet at the command line, and while some of the text shows up clearly, other text does not, and writing a parser to pick out what's text and what's binary data would be difficult. Not impossible, but not exactly what I'd call a prime concern for Apple right now.



    Sounds like a great 3rd party opportunity... (Speaking of, does anyone know if indexing supports file format plugins?)
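For anyone who wants to repeat that command-line look at a spreadsheet, here is a minimal Python sketch in the spirit of the Unix `strings` tool. It pulls out runs of printable characters from raw bytes, both as plain 8-bit text and as 16-bit little-endian text. That two-encoding guess is an assumption on my part about why some text shows up clearly and some does not; it is not a real Excel parser, and the function name `printable_runs` is made up for illustration.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable text found in raw bytes.

    Looks for two kinds of runs (an assumption, not the Excel spec):
    plain 8-bit ASCII, and 16-bit little-endian text where every
    printable byte is followed by a zero byte.
    """
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pat.finditer(data)]
    return found


if __name__ == "__main__":
    import sys

    # Usage: python strings_sketch.py Budget.xls
    with open(sys.argv[1], "rb") as handle:
        for run in printable_runs(handle.read()):
            print(run)
```

Anything that only appears in the second (16-bit) pass is text a byte-oriented indexer would likely miss.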
  • Reply 2 of 9
    amorph Posts: 7,112 member
    Also, if the file's big, indexing won't catch all of it. It only looks in the first few thousand characters of each file (at least as of 10.1 - I haven't tried it in Jaguar), so if the string you're looking for is at the end of a long document, good luck.



    Plug-ins for content indexing are a really good idea. I don't think they're currently supported, though.
  • Reply 3 of 9
    Thanks, but here is the tricky part... the OS 9 Finder finds the files just fine (if I reboot in 9). AND it finds them way, way faster.



    Also, while there is a 2000 unique term limit per document in X, the documents I am looking for probably have 150 or so unique terms, so I don't think that is the problem.



    I suppose I will just boot up in 9 when I need to do this (it actually takes less time to reboot twice and do the search than to search in X, which doesn't find it anyway!). Dang.





    Fish
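To make the unique-term limit concrete, here is a toy Python model of an indexer that stops collecting once it has seen a fixed number of distinct terms. The 2000 figure comes from the post above; everything else (the tokenizing rule, the function names `indexed_terms` and `would_be_found`) is an assumption for illustration and says nothing about how the real indexer tokenizes.

```python
import re


def indexed_terms(text: str, max_unique: int = 2000) -> set:
    """Collect unique lowercase terms in reading order, stopping at the cap."""
    seen = set()
    for word in re.findall(r"[A-Za-z0-9]+", text):
        if len(seen) >= max_unique:
            break  # cap reached: the rest of the document is never indexed
        seen.add(word.lower())
    return seen


def would_be_found(text: str, term: str, max_unique: int = 2000) -> bool:
    """True if the term falls inside the capped set of indexed terms."""
    return term.lower() in indexed_terms(text, max_unique)
```

With a document of roughly 150 unique terms, this model indexes every term, which fits the reasoning that the limit is not the cause here.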
  • Reply 4 of 9
    kickaha Posts: 8,760 member
    Weird! I don't have a particularly slow search by content on my machine... you said you deleted and re-indexed as well? Wacky.



    Sounds like they may have, er, not quite fleshed out the file parsers under X. Bugger.



    Hmmm. Call me insane, but could you just see a suite of file parsers used not only for find-by-content but also for file format translation, as plugins, extensible by third parties, as a Service? *shudder* Okay, I'm better now.
  • Reply 5 of 9
    cowerd Posts: 579 member
    [quote]This is why we all love XML files, isn't it boys and girls?[/quote]

    No, Office XP file formats are supposed to be based on XML, and that's just binary goo embedded in an XML schema. Go MS>
  • Reply 6 of 9
    Well, I think I MAY have found the problem, but I am not sure how to get around it.



    I did delete the index file, and I thought I had re-indexed, but when I do a Get Info, it lists the status as "Indexing" while the progress bar shows zero change (in fact, it shows nothing). Clicking "Stop Indexing" has no effect. I have tried rebooting and running disk utilities, but no luck.



    Anyone?





    Fish
  • Reply 7 of 9
    kickaha Posts: 8,760 member
    [quote]Originally posted by cowerd:

    No, Office XP file formats are supposed to be based on XML, and that's just binary goo embedded in an XML schema. Go MS>[/quote]



    XML doesn't prevent you from being a poor team player, but it does make it much easier to be a good one. That's why we like it.



    M$ will, of course, always figure out a way to screw it up, guaranteed.



    fishdoc: I'll take a look around for a CLI indexing tool... anyone out there know of one already?



    [ 12-18-2002: Message edited by: Kickaha ]
  • Reply 8 of 9
    You might want to check /System/Library/Find and see whether the file type or extension for Excel files is listed in the StopExts or StopTypes files.
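A quick way to run that check from a script, sketched in Python. The folder and the StopExts file name come from the reply above. The file format is an assumption: I am guessing one entry per line, so adjust `read_entries` if the real files turn out to look different. Both function names are made up for this example.

```python
from pathlib import Path

FIND_DIR = Path("/System/Library/Find")  # location named in the reply above


def read_entries(path) -> set:
    """Read a stop list, ASSUMING one entry per line.

    Returns lowercase entries, or an empty set if the file cannot be read.
    """
    try:
        text = Path(path).read_text(errors="replace")
    except OSError:
        return set()
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def is_excluded(extension: str, stop_file=FIND_DIR / "StopExts") -> bool:
    """True if the extension (with or without a leading dot) is in the stop list."""
    wanted = extension.lower().lstrip(".")
    return any(entry.lstrip(".") == wanted for entry in read_entries(stop_file))


if __name__ == "__main__":
    print("xls excluded from indexing:", is_excluded("xls"))
```

The same `read_entries` helper can be pointed at the StopTypes file to look for a file type code instead of an extension.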
  • Reply 9 of 9
    Got this from the Apple boards. This works, but you MAY get the same error next time anyway...



    The following information is from a note I sent to an Apple Senior Tech, who later informed me that Apple has acknowledged this as a bug they expect to fix in a future release. The note also contains rough instructions for what is needed to fix this so you can get a good index for content search.



    There is a bug in ContentIndexing, the utility used to build an index of folders and volumes. If ContentIndexing encounters a file which it believes is corrupted, it crashes. This results in several issues:



    The crash can leave three or four files at the root level of the hard disk, volume, or folder upon which the indexing job was launched. These files are:



    .FBCIndex

    .FBCIndexCopy

    .FBCLockFolder (a directory/folder) which contains the last of the files,

    .FBCSemaphoreFile



    These files are invisible. While they remain on the hard disk, volume, or folder on which the indexing process failed, no further indexing operation can be performed on that same disk, volume, or folder. Sometimes it takes two failed Content Index operations in a row to produce this problem; other times just one is enough.



    The user knows they have this problem when the Get Info/Content Index pane of the affected disk/volume/folder shows the following information:



    Status: Indexing...

    Date: --

    ============= (a grey progress bar without any progress color shown)

    [Stop Indexing] button and [Delete Index] button



    and where:



    - The Stop Indexing button does nothing when selected.

    - The Delete Index button is unavailable, i.e. greyed out.

    - Despite the Status: Indexing info, no ContentIndexing process is running (checked with top and Process Viewer).



    The user must then Find (Apple+F) these files by searching the affected disk/volume/folder for invisible files (Find criterion "Visibility > off") that also meet the criterion "File name > begins with > .FBC"



    These can then be moved to the Trash with Apple+Delete. However, to remove them from the Trash, the user must employ Terminal and use rm and rmdir (or rm -rf) commands to delete them.



    Once the three or four .FBC* files are trashed/deleted, Get Info / Content Indexing displays the standard information for creating a new Index (as shown in Mac Help) and a new indexing job can be created.



    The requested fix is for ContentIndexing to clean up these files after a crash and to provide the user with the identity of the file which, when indexed, caused the crash. The user could then remove this file and indexing could proceed. Another option would be for the indexing job to proceed without crashing and tell the user which files were omitted because of potential corruption.
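The cleanup steps above can also be scripted. This is a rough Python sketch, not an Apple-supplied tool: it lists entries whose names start with .FBC directly under a given disk, volume, or folder, and deletes them only when asked (the equivalent of the rm -rf mentioned above). The function names and the dry-run default are my own choices, and you may need suitable permissions to delete at the root of a volume.

```python
import shutil
from pathlib import Path


def find_leftovers(root) -> list:
    """List entries directly under root whose names start with .FBC."""
    return sorted(p for p in Path(root).iterdir() if p.name.startswith(".FBC"))


def remove_leftovers(root, dry_run: bool = True) -> list:
    """Return the leftovers; delete them too when dry_run is False."""
    leftovers = find_leftovers(root)
    for path in leftovers:
        if dry_run:
            continue  # report only, touch nothing
        if path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)  # .FBCLockFolder and the semaphore file inside it
        else:
            path.unlink()
    return leftovers


if __name__ == "__main__":
    import sys

    # Usage: python fbc_cleanup.py /Volumes/SomeDisk   (dry run, lists only)
    for item in remove_leftovers(sys.argv[1], dry_run=True):
        print(item)
```

Run it as a dry run first and check the list before switching `dry_run` to False, since deletion cannot be undone.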