Help Cataloging A Personal Library/Archive?

in AppleOutsider edited January 2014
Hi all.



I have recently figured out that my filing system for my personal library/archive doesn't work. How do I know this? I couldn't figure out which of my four tubs of file folders an article was in.



Here's the skinny:



I'm an academic and I do a large amount of work with sources from the 19th century. Lots of these are photocopies. Some are my notes. I need to figure out a way to catalog what I have and store it so I can retrieve it when I need it. For my larger documents, I use clipless binders with a title on the spine indicating what's in them.



But I've got hundreds of other, loose, bits in hanging folders. I want to change that.



I'm wondering if anyone has any experience trying to organize/catalog something like this. I do NOT want to use an electronic database. I do NOT want to scan anything. I just want to catalog, index and file. At the moment, I'm thinking of just doing this in 3-ring binders with plastic sleeves to hold the documents and then some kind of coding system (a binder number + a document number) and a plain Excel sheet indexing everything.
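
To make the binder-plus-document-number idea concrete, here is a minimal Python sketch of how such codes and a flat index could work. Everything specific in it is an assumption for illustration only: the `B03-017` code format, the column names, the helper names `make_code`, `next_code` and `to_csv`, and the binder and document numbers assigned to the sample items.

```python
import csv
import io


def make_code(binder: int, doc: int) -> str:
    """Build a shelf code such as 'B03-017' (binder 3, document 17)."""
    return f"B{binder:02d}-{doc:03d}"


def next_code(index: list, binder: int) -> str:
    """Return the next unused code in a binder, so new items are simply appended."""
    prefix = f"B{binder:02d}-"
    used = [int(row["code"].split("-")[1]) for row in index
            if row["code"].startswith(prefix)]
    return make_code(binder, max(used, default=0) + 1)


def to_csv(rows: list) -> str:
    """Serialize the index as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["code", "author", "title", "tags"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical rows: the binder and document numbers are made up.
index = [
    {"code": make_code(1, 1), "author": "Greg, W. R.",
     "title": "Why Are Women Redundant", "tags": "essay"},
    {"code": make_code(1, 2), "author": "",
     "title": "Clergyman's diary, 1848", "tags": "manuscript; diary"},
]

print(next_code(index, 1))  # the code to write on the next sleeve in binder 1
print(to_csv(index))
```

Because the code only records where a sheet sits, not what it is about, a binder can fill up in any order and the scheme still scales: the subject information lives in the index columns.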



The problem is that this needs to be scalable; ideally, I'll keep this system going for many years and will add to it, at times, rapidly. Does anyone have any advice? Experience?



Thanks in advance.

Comments

  • Reply 1 of 26
    You have my sympathies. I probably don't have much advice to offer, but I do have to maintain archive files for my accounting office, and what we do is use bankers boxes with hanging file folders. The boxes are numbered and labeled with their contents. The boxes are created by type and then filed alphabetically. There is a database that matches the boxes. I used to work for a union and I was able to quickly pull files from the forties and fifties using this system. When using binders I find it preferable, if possible, to photocopy everything to standard size, but plastic sleeves could work. My two biggest concerns with using binders are space and cost. Of course I am in the process of trying to get as much of this material stored as PDFs as possible, which is a whole other set of problems!
  • Reply 2 of 26
    midwinter Posts: 10,060 member
    Thanks, TF. It looks like what I'm going to have to do is create my own cataloging system. I would, at some point, turn these into PDFs, but frankly it's easier just to deal with the hard copies right now--and for the foreseeable future.
  • Reply 3 of 26
    dmz Posts: 5,775 member
    To hell with that midwinter -- GIGO!! Get some clean scans and blow your stuff through either OmniPage Pro or ABBYY FineReader.



    In fact you could go a long, long way towards your goal by turning a copy of Acrobat Professional loose on your scans (Pro will do OCR). It wouldn't be 100% (it would catch 90% of the verbiage), but you could still create Acrobat 'indexes' of your docs, and search them on the fly.



    Don't be a ninny!!



    It's either that or Excel Hell.
  • Reply 4 of 26
    midwinter Posts: 10,060 member
    DMZ: I understand, but OCR isn't really an option. I need to retain page numbers on documents that were printed. I have lots of documents that are hand-written (e.g. the diary of a clergyman from 1848, handwritten letters). And getting a clean scan? Impossible for large chunks of the time because I'm working either with original documents that were in bad shape to begin with or with microfiche/film xeroxes.



    This isn't really about having everything available at the click of a mouse. It's about being able to find some essay when I find myself, like today, trying to find something for a footnote. There's also a lot of memory of the shape of pages and where information is on a page that would be lost if they were OCR'd, if that makes sense (I often remember that something I want to remember was printed on a certain column where the page looked a certain way).
  • Reply 5 of 26
    addabox Posts: 12,665 member
    Quote:
    Originally Posted by midwinter View Post


    DMZ: I understand, but OCR isn't really an option. I need to retain page numbers on documents that were printed. I have lots of documents that are hand-written (e.g. the diary of a clergyman from 1848, handwritten letters). And getting a clean scan? Impossible for large chunks of the time because I'm working either with original documents that were in bad shape to begin with or with microfiche/film xeroxes.



    This isn't really about having everything available at the click of a mouse. It's about being able to find some essay when I find myself, like today, trying to find something for a footnote. There's also a lot of memory of the shape of pages and where information is on a page that would be lost if they were OCR'd, if that makes sense (I often remember that something I want to remember was printed on a certain column where the page looked a certain way).



    Aha, this is where you and that Baker essay on card catalogues intersect.
  • Reply 6 of 26
    midwinter Posts: 10,060 member
    Quote:
    Originally Posted by addabox View Post


    Aha, this is where you and that Baker essay on card catalogues intersect.



    Hehe. Well, only if I'm burning them!
  • Reply 7 of 26
    dmz Posts: 5,775 member
    Quote:
    Originally Posted by midwinter View Post


    DMZ: I understand, but OCR isn't really an option. I need to retain page numbers on documents that were printed. I have lots of documents that are hand-written (e.g. the diary of a clergyman from 1848, handwritten letters). And getting a clean scan? Impossible for large chunks of the time because I'm working either with original documents that were in bad shape to begin with or with microfiche/film xeroxes.



    This isn't really about having everything available at the click of a mouse. It's about being able to find some essay when I find myself, like today, trying to find something for a footnote. There's also a lot of memory of the shape of pages and where information is on a page that would be lost if they were OCR'd, if that makes sense (I often remember that something I want to remember was printed on a certain column where the page looked a certain way).



    hmmmm... Acrobat Pro (especially) would keep things nailed down on the page where they originated, and it's possible to number pages -- 'paste' down headers, etc. Acrobat makes a point of keeping the xy coordinates of objects the same -- to a fault. Also, don't be too afraid of the microfiche representations -- a lot of the programs now boast working from screen captures and faxes -- even digital cameras. Abbyy's documentation seems to be keen on that -- 'just take a picture....'



    But for handwritten docs or things that you can't scan at all, or large jobs you couldn't pawn off on the University copy shop, I dunno.



    Wasn't there something.. Delicious library??...



    Edit: check this out:



    http://www.delicious-monster.com/
  • Reply 8 of 26
    midwinter Posts: 10,060 member
    Quote:
    Originally Posted by dmz View Post


    hmmmm... Acrobat Pro (especially) would keep things nailed down on the page where they originated, and it's possible to number pages -- 'paste' down headers, etc. Acrobat makes a point of keeping the xy coordinates of objects the same -- to a fault. Also, don't be too afraid of the microfiche representations -- a lot of the programs now boast working from screen captures and faxes -- even digital cameras. Abbyy's documentation seems to be keen on that -- 'just take a picture....'



    But for handwritten docs or things that you can't scan at all, or large jobs you couldn't pawn off on the University copy shop, I dunno.



    Wasn't there something.. Delicious library??...



    Believe me. I understand how all of this works and how cool it all is. But most of my documents are scanned with a 2-page layout in landscape. They have marginal notes and underlining that fuck up OCR. Lots of them are mid-90s fax quality. But again, I don't really want PDF. It would take me a year to scan it all. It would take me forever to tag it all in Yojimbo (which I use as my research junk drawer). Right now I just want to catalog, file and index so I know where it all is the next time I'm looking for WR Greg's essay "Why Are Women Redundant" and can't find it.
  • Reply 9 of 26
    trick fall Posts: 1,271 member
    I'm sure there is also a certain level of enjoyment going through all of those bits of paper. Just keep in mind you will need shelving for all those binders.
  • Reply 10 of 26
    Quote:
    Originally Posted by midwinter View Post


    Believe me. I understand how all of this works and how cool it all is. . . . .



    I'm not sure about that. I bought a Fujitsu ScanSnap Pro a few months ago, and it has been life-altering. It scans full bleed, duplex, and it's sheet-fed. It eats through stacks of paper faster than I can shred them, and dumps everything into PDF straight away. I'm not sure if it does OCR or not.



    If you're still hell-bent on non-digital organization, I would agree that three-ring binders are the way to go. I keep all of my documents in three-ring binders in addition to the scanned digital copies. The only other suitable option is hanging folders in a file cabinet, but they aren't as portable.
  • Reply 11 of 26
    dmz Posts: 5,775 member
    Quote:
    Originally Posted by midwinter View Post


    Believe me. I understand how all of this works and how cool it all is. But most of my documents are scanned with a 2-page layout in landscape. They have marginal notes and underlining that fuck up OCR. Lots of them are mid-90s fax quality. But again, I don't really want PDF. It would take me a year to scan it all. It would take me forever to tag it all in Yojimbo (which I use as my research junk drawer). Right now I just want to catalog, file and index so I know where it all is the next time I'm looking for WR Greg's essay "Why Are Women Redundant" and can't find it.



    You can tell the OCR to look for facing pages -- and the fax quality isn't all that big a deal. Maybe the thing isn't to try to go all one way or the other; you could scan what will represent well and then manually archive the stuff that was too funky. Divide and conquer?



    What about delicious?
  • Reply 12 of 26
    groverat Posts: 10,872 member
    Divide and conquer is definitely my suggestion as well. Scan (or get someone else to scan) what is suitable for scanning and physically store the rest.



    Your organization can (should?) be all digital. Even if you keep the physical documents, your reference guide to find where they are can (should?) be digital, for easy updating and easy searching. After that it is simply a matter of pointing you in the right direction (digital or physical storage).



    I would say 90% of my stuff is electronic, but I still have binders with plastic sleeves for a lot of stuff that might not lend itself well to OCR. I try to avoid that where possible, because in-document search mechanisms (Spotlight on Mac and the Windows equivalents) are invaluable for information searching.



    And if you are planning on building up a wealth of stuff until you die, digital is ideal (as you know I am sure).



    I would also suggest talking to local print shops that might have digitization services. Call or e-mail and describe what you have and maybe you could pay someone to make it all nice and sexy for you.
  • Reply 13 of 26
    midwinter Posts: 10,060 member
    Quote:
    Originally Posted by dmz View Post


    You can tell the OCR to look for facing pages -- and the fax quality isn't all that big a deal. Maybe the thing isn't to try to go all one way or the other; you could scan what will represent well and then manually archive the stuff that was too funky. Divide and conquer?



    What about delicious?



    I do keep a digital archive of some things, mostly in Yojimbo. And I do keep digital versions of my notes, all of which are synced up among my three Macs and kept in RTF so they won't be inaccessible 10 years from now.



    Delicious, in my experience, is too slow on my computers and is in the end designed to do something else.
  • Reply 14 of 26
    dmz Posts: 5,775 member
    Quote:
    Originally Posted by midwinter View Post


    I do keep a digital archive of some things, mostly in Yojimbo. And I do keep digital versions of my notes, all of which are synced up among my three Macs and kept in RTF so they won't be inaccessible 10 years from now.



    Delicious, in my experience, is too slow on my computers and is in the end designed to do something else.



    You sir, are a bona fide Luddite and an Adobephobe.



    Shame. Shaaaame!
  • Reply 15 of 26
    midwinter Posts: 10,060 member
    Quote:
    Originally Posted by dmz View Post


    You sir, are a bona fide Luddite and an Adobephobe.



    Shame. Shaaaame!



    Um, I'm an English professor! What did you expect!?!



    And I am NOT a Luddite. I have never even BEEN to Manchester in 1826, nor would I have smashed any looms when I was there.



    But yes, there are some things I do not like to use technology for.
  • Reply 16 of 26
    kickaha Posts: 8,760 member
    My wife is in a similar boat, and did the OCR/scanning route for a while - too much work for too little benefit.



    You may consider getting BibDesk. It's a LaTeX/BibTeX reference manager that you could use to manage all the keywords, etc, and then have a single entry for which file folder it's in. Heck, number the folders sequentially - they don't have to have any sense to them at all, other than just partitioning down the number of documents you need to flip through.



    The entry format is completely extensible, so you can add whatever metadata you want.
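
    The "arbitrary folder number plus keywords" idea can be sketched in a few lines of plain Python. This is not BibDesk or BibTeX, just an illustration of the lookup; the field names, the `find` helper, the keywords and the folder numbers are all assumptions made up for the example.

```python
# Hypothetical entries: folder numbers carry no meaning, the keywords do the work.
entries = [
    {"title": "Why Are Women Redundant", "folder": 12,
     "keywords": {"greg", "essay", "women"}},
    {"title": "Clergyman's diary, 1848", "folder": 3,
     "keywords": {"diary", "manuscript", "1848"}},
    {"title": "Handwritten letters", "folder": 3,
     "keywords": {"letters", "manuscript"}},
]


def find(entries, *keywords):
    """Return sorted (folder, title) pairs for entries matching every keyword."""
    wanted = {k.lower() for k in keywords}
    return sorted(
        (e["folder"], e["title"])
        for e in entries
        if wanted <= e["keywords"]  # subset test: all wanted keywords present
    )


for folder, title in find(entries, "manuscript"):
    print(f"folder {folder}: {title}")
```

    The lookup only narrows things down to a folder; flipping through that one folder by hand is the last step, which is why the folder numbers never need to mean anything.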
  • Reply 17 of 26
    dmz Posts: 5,775 member
    Quote:
    Originally Posted by Kickaha View Post


    My wife is in a similar boat, and did the OCR/scanning route for a while - too much work for too little benefit.



    I do think that the OCR route would only be to 'bomb in' your documents; if you caught 80-90% of the words, that would probably go a long way towards locating something. But if you try to do any correction to the OCR, especially if it is in the 80-90% range -- it's death by a thousand typos.
  • Reply 18 of 26
    midwinter Posts: 10,060 member
    Quote:
    Originally Posted by dmz View Post


    I do think that the OCR route would only be to 'bomb in' your documents; if you caught 80-90% of the words, that would probably go a long way towards locating something. But if you try to do any correction to the OCR, especially if it is in the 80-90% range -- it's death by a thousand typos.



    I'm very, very reluctant to do anything that is guaranteed to introduce error into this.
  • Reply 19 of 26
    midwinter Posts: 10,060 member
    Quote:
    Originally Posted by Kickaha View Post


    My wife is in a similar boat, and did the OCR/scanning route for a while - too much work for too little benefit.



    You may consider getting BibDesk. It's a LaTeX/BibTeX reference manager that you could use to manage all the keywords, etc, and then have a single entry for which file folder it's in. Heck, number the folders sequentially - they don't have to have any sense to them at all, other than just partitioning down the number of documents you need to flip through.



    The entry format is completely extensible, so you can add whatever metadata you want.



    I played around with it for a while, and I felt like I was trying to use a combine to mow my yard. Right now, I can maintain an Excel list with some tagging information and combine that with Yojimbo to incrementally scan and store as PDF.
  • Reply 20 of 26
    dmz Posts: 5,775 member
    Quote:
    Originally Posted by midwinter View Post


    I'm very, very reluctant to do anything that is guaranteed to introduce error into this.



    Dammit midwinter they're just documents! There's nothing outside of the text!