Update on 2013-02-16: European/German Rules Added
Intro
I have always liked the Hazel app. I thought of it as a nice little tool that does nice little things. Since I got it (about two years ago), I have made some rules to help me a bit with my daily computer life.
I hadn't taken the time to study Hazel, so my rules were quite simple, like:
- Move Screen Shots from the desktop to a screen shot folder,
- Set a color label to an old file,
- Import photos from a specific folder into Aperture,
and similar.
Meanwhile I've gotten older. And the older I get, the more (paper) mail I get: letters from companies, from insurers, from my employer, from my school and so on. In addition, I am used to collecting receipts from everywhere. Who knows? Maybe I'll need them some day? ;-P
Because I like Macs, and because I like reading, I follow a lot of people's RSS feeds, which are often about Mac and Apple stuff. So it happened that I saw the word "paperless" more and more. Some of the RSS guys recommended David Sparks' "Paperless" book. I bought it and found some interesting tips and tricks in there. I liked the idea of a fully automated paperless system, where you scan a sheet of paper and the computer does the rest. The book describes how to achieve this. But there was one step in the whole process that didn't satisfy me:
In the book, after a document is scanned, it gets OCR'd. From there on it is up to Hazel to proceed with the file: the file gets a new name based on the content of the document, a date in the file name, and it gets sorted into a specific folder. Everything works fine, but sometimes the 'add time stamp to the file name' part isn't what you want, because it just writes the date the file was added to the computer into the file name, which is basically the date on which the document was scanned or downloaded.
There are situations when this is okay, for example if you scan a receipt on the same day you bought something. But I have different habits: I collect the receipts, mail and documents and scan them once in a while. So with the method from "Paperless", all the files get the date on which they were scanned in their file names, not the date on which the physical paper version was created.
But the latter was what I wanted: something that would add the date on which the paper version was created to the file name (in other words, I wanted to extract the date from the document's content), so that the documents could be stored in an accurate nested folder structure.
I did some internet research and found ideas from some smart people, like Python scripts, a UNIX command line tool built into the Mac and stuff like this. But I couldn't get them to work, or they weren't supported on the current OS X version.
And this was the moment I decided to make my own Hazel rule, which would extract the date from the file's content and add it to the file name.
General Stuff about the Rules
I made Hazel rules for dates in two formats:
- Gregorian little-endian, starting with the day (Day-Month-Year, Europe)
- Middle-endian, starting with the month (Month-Day-Year, U.S./English)
In addition, both versions support the Gregorian big-endian format, starting with the year (Year-Month-Day).
Supported European/German formats in detail:
The rules support the following date formats. I'll write down the formats with the example '15. Juni 2013', so that it is clear which date formats are supported. That doesn't mean that only this one date will work; all dates from 01.01.2000 to 31.12.2013 will work. Be aware: I have added the written month names in German only. Non-German users will need to adjust the written month names, or just delete them; this will make the rules faster, but you'll lose some functionality.
- '15.06.2013'
- '15.06.13'
- '15-06-2013'
- '15-06-13'
- '15/06/2013'
- '15/06/13'
- '2013.06.15'
- '2013-06-15'
- '15. Juni 2013'
- '15. Juni 13'
- '15. Jun. 2013'
- '15. Jun. 13'
The output format is always YYYY-MM-DD, in both the U.S./English and the European version.
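For readers who would rather script this than click through Hazel, the formats listed above could be matched roughly like this in Python. This is only a sketch and not how the Hazel rules work internally: the function names, the regular expressions and the decision to read two-digit years as 20xx are my assumptions for illustration. It also accepts single-digit days and months, which the list above does not promise.

```python
import re
from datetime import date

# First three letters of the German month names, which also covers
# abbreviations such as 'Jun.' (assumption: the OCR keeps umlauts intact).
MONTH_PREFIXES = {
    "jan": 1, "feb": 2, "mär": 3, "apr": 4, "mai": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "okt": 10, "nov": 11, "dez": 12,
}

# '2013.06.15' and '2013-06-15'
YEAR_FIRST = re.compile(r"(?<!\d)(\d{4})([.-])(\d{2})\2(\d{2})(?!\d)")
# '15.06.2013', '15-06-13', '15/06/2013', ...
DAY_FIRST = re.compile(r"(?<!\d)(\d{1,2})([./-])(\d{1,2})\2(\d{4}|\d{2})(?!\d)")
# '15. Juni 2013', '15. Jun. 13', ...
DAY_FIRST_TEXT = re.compile(
    r"(?<!\d)(\d{1,2})\.\s*([A-Za-zÄäÖöÜü]+)\.?\s+(\d{4}|\d{2})(?!\d)"
)


def to_iso(year, month, day):
    """Return 'YYYY-MM-DD', or None if the numbers are not a real date."""
    year = int(year)
    if year < 100:  # assumption: two-digit years mean 20xx
        year += 2000
    try:
        return date(year, int(month), int(day)).isoformat()
    except ValueError:
        return None


def extract_european_date(text):
    """Try the three format families in turn; return 'YYYY-MM-DD' or None."""
    for m in YEAR_FIRST.finditer(text):
        iso = to_iso(m.group(1), m.group(3), m.group(4))
        if iso:
            return iso
    for m in DAY_FIRST.finditer(text):
        iso = to_iso(m.group(4), m.group(3), m.group(1))
        if iso:
            return iso
    for m in DAY_FIRST_TEXT.finditer(text):
        month = MONTH_PREFIXES.get(m.group(2)[:3].lower())
        if month:
            iso = to_iso(m.group(3), month, m.group(1))
            if iso:
                return iso
    return None
```

Like the Hazel rules, this sketch assumes that the text contains only one date.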
Supported U.S./English formats in detail:
The rules support the following date formats. I'll write down the formats with the example 'June 15, 2013', so that it is clear which date formats are supported. That doesn't mean that only this one date will work; all dates from 01/01/2000 to 12/31/2013 will work:
- '6/15/2013'
- '06/15/2013'
- '6/15/13'
- '06/15/13'
- '6.15.2013'
- '06.15.2013'
- '6.15.13'
- '06.15.13'
- '6-15-2013'
- '06-15-2013'
- '6-15-13'
- '06-15-13'
- '2013-06-15'
- 'June 15, 2013' (this will work for single and double digit day dates)
- 'Jun. 15, 13' (this will work for single and double digit day dates)
- 'Jun. 15, 2013' (this will work for single and double digit day dates)
The output format is always YYYY-MM-DD, in both the U.S./English and the European version.
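The same idea for the U.S./English formats, again as a rough Python sketch and not as a description of what the Hazel rules do internally. The function names, the regular expressions and the reading of two-digit years as 20xx are assumptions for illustration.

```python
import re
from datetime import date

# First three letters of the English month names; this also covers 'Jun.'.
MONTH_PREFIXES = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

# '2013-06-15'
YEAR_FIRST = re.compile(r"(?<!\d)(\d{4})-(\d{2})-(\d{2})(?!\d)")
# '6/15/2013', '06.15.13', '6-15-2013', ...
MONTH_FIRST = re.compile(r"(?<!\d)(\d{1,2})([./-])(\d{1,2})\2(\d{4}|\d{2})(?!\d)")
# 'June 15, 2013', 'Jun. 5, 13', ...
MONTH_FIRST_TEXT = re.compile(
    r"\b([A-Za-z]{3,9})\.?\s+(\d{1,2}),\s*(\d{4}|\d{2})(?!\d)"
)


def to_iso(year, month, day):
    """Return 'YYYY-MM-DD', or None if the numbers are not a real date."""
    year = int(year)
    if year < 100:  # assumption: two-digit years mean 20xx
        year += 2000
    try:
        return date(year, int(month), int(day)).isoformat()
    except ValueError:
        return None


def extract_us_date(text):
    """Try the three format families in turn; return 'YYYY-MM-DD' or None."""
    for m in YEAR_FIRST.finditer(text):
        iso = to_iso(m.group(1), m.group(2), m.group(3))
        if iso:
            return iso
    for m in MONTH_FIRST.finditer(text):
        iso = to_iso(m.group(4), m.group(1), m.group(3))
        if iso:
            return iso
    for m in MONTH_FIRST_TEXT.finditer(text):
        month = MONTH_PREFIXES.get(m.group(1)[:3].lower())
        if month:
            iso = to_iso(m.group(3), month, m.group(2))
            if iso:
                return iso
    return None
```

A month number above 12 (as in a European day-first date) is rejected by `to_iso`, so such a date is not silently misread.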
How to Set Up the Rules
What you need: OCR software, Hazel and the rules :-).
Next you'll need to create some folders on your Mac. I have chosen the 'Action' folder in my Dropbox to get this task done. You can use any folder you like. But if you use my rules, you'll have to adjust where the files shall be moved after each step. So if you don't want additional work with the rules, I recommend that you create the same folder structure that I have.
My folder structure looks like this:
- ~/Dropbox/
- ~/Dropbox/Action/
- ~/Dropbox/Action/Add Information/
- ~/Dropbox/Action/Add Information/1.1 Add Day
- ~/Dropbox/Action/Add Information/1.2 Add Month
- ~/Dropbox/Action/Add Information/1.3 Add Year
- ~/Dropbox/Action/Add Information/2.0 Add Context
- ~/Dropbox/Action/Add Information/3.0 Add Location
- ~/Dropbox/Action/Add Information/4.0 Add Person
- ~/Dropbox/Action/Add Information/5.0 Check
Next: Just download the rules (or create them yourself) and import each rule set into its matching folder. Then just start adding your files :-).
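If you don't want to create the folders by hand, a few lines of Python can do it. This is just a convenience sketch: the function name `create_action_folders` is made up for this example, and the base folder is whatever you pass in (for my setup that would be "~/Dropbox").

```python
from pathlib import Path

# The stage folders from the structure above, in processing order.
STAGE_FOLDERS = [
    "1.1 Add Day",
    "1.2 Add Month",
    "1.3 Add Year",
    "2.0 Add Context",
    "3.0 Add Location",
    "4.0 Add Person",
    "5.0 Check",
]


def create_action_folders(base):
    """Create <base>/Action/Add Information/<stage> for every stage folder."""
    root = Path(base).expanduser() / "Action" / "Add Information"
    created = []
    for name in STAGE_FOLDERS:
        folder = root / name
        # exist_ok=True makes it safe to run the function more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Example call (not executed here): create_action_folders("~/Dropbox")
```

The folders still have to be added to Hazel, and the rules imported into them, by hand.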

The rules
You can build the rules yourself. Here are some screenshots:
European/German Add Day
Below are screenshots of the first of the 31 rules for extracting the day part of the date.
European/German Add Month
Below are screenshots of the first of the 12 rules for extracting the month part of the date.
European/German Add Year
Below are screenshots of the rules for the year 2013.
U.S./English Add Day
Below are screenshots of the first of the 31 rules for extracting the day part of the date.
U.S./English Add Month
Below are screenshots of the first of the 12 rules for extracting the month part of the date.
U.S./English Add Year
Below are screenshots of the rules for the year 2013.
Add Context, Add Location, Add Person, Fallback (in Add Day folder)
Below are screenshots of some sample rules for adding a context, a location and a person, and of a fallback rule.
How it Works
You’ll need an OCR’d file with one date in its content. I use PDFPenPro to do the OCR. On the Mac I scan and OCR the files manually, because I only have a flatbed scanner. If I want to add a file (by file I mean e.g. a receipt) via my iPhone, I use Scanner Pro. It uploads the file to another folder in my Action folder in my Dropbox; then a Hazel rule starts an automatic PDFPenPro OCR.
The next step is to put the file into the 1.1 Add Day folder. The Hazel rule there keeps the original file name (so that there are no conflicts if you add multiple files with the same date in their content at once). Hazel then looks for the day part of the date. If it finds one, it adds the day to the file name and moves the file to the next folder, which is 1.2 Add Month. There the rule looks for the month part of the date. If it finds one, it adds it to the file name and passes the file to the folder 1.3 Add Year, where the year is added. After this the file is moved to 2.0 Add Context. There the original file name part is removed and Hazel looks for a matching context, like water bill, cellular bill or so. The downloadable rules contain a rule for the water bill. You can copy and adjust it for every context you need. The folders 3.0 Add Location and 4.0 Add Person have rules similar to Add Context, but the rules there add a location and/or a person. If you don’t need them, just stop after Add Context. The final folder is 5.0 Check. You can set up some rules there which will do what you want. I use this folder to check if everything went all right.
If you want, you can add a fallback rule to each of the folders: this rule starts after a predefined time (my default is 15-30 minutes after a file has been added to a folder) and moves the file to the 5.0 Check folder, so you’ll know that something went wrong. The fallback rule needs to be the last rule in each folder.
Note: the rules are very long. That means that the whole renaming process can take some time. Hazel has to think very hard, and that can cause a notable CPU load during processing. On my Late 2009 iMac a renaming takes 15-60 seconds and causes a CPU load of about 30 %. The longer a file is or the more text it contains, the longer the renaming process takes. The processing time depends on the date, too. Because Hazel rules run from top to bottom, an early month of the year is detected faster than a late one. A late day of the month is detected faster than an early one (I sorted the days this way, don’t ask for a reason, because there is none ;-) ), and the current year is detected faster than the year 2000. Right now the fastest date to detect is 01/31/2013 and the slowest is 12/01/2000. You can change the order of the rules to your liking if you want :-).
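To make the idea of the fallback rule concrete, here is a small Python sketch of the same logic outside of Hazel. It is an illustration under assumptions, not a replacement for the rule: the function name `move_stale_files` is made up, and the sketch uses the file's modification time as a stand-in, because Hazel's "date added" is not directly available to a simple script.

```python
import shutil
import time
from pathlib import Path


def move_stale_files(stage_folder, check_folder, max_age_minutes=15, now=None):
    """Move files that sat in stage_folder for too long into check_folder.

    Assumption: the modification time approximates "time spent in the folder".
    Returns the list of new paths of the moved files.
    """
    now = time.time() if now is None else now
    stage, check = Path(stage_folder), Path(check_folder)
    check.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(stage.iterdir()):
        # Skip subfolders and hidden files such as .DS_Store.
        if not path.is_file() or path.name.startswith("."):
            continue
        age_minutes = (now - path.stat().st_mtime) / 60
        if age_minutes >= max_age_minutes:
            target = check / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

A file with the same name that already sits in the check folder may be overwritten by this sketch, so it is best tried on copies first.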
Known Issues:
- The rules only work reliably when there is only one date in the file’s content. If there is more than one date, the rules can mix them up and you can get an improperly renamed file (Hazel runs the rules from top to bottom. So if the file’s content has two dates, let's say 01/30/2012 and 02/02/2013, the result will be 2012-01-02).
- It is possible that a file gets duplicated during the renaming process. I’m not sure why this happens.
- In some rare cases when I tested the rules, a file disappeared from the folders. Hazel ate it :-P. So maybe it’s better to work with copies and to delete the original, not yet renamed files only after the renamed files have landed in the 5.0 Check folder.
- Improper renaming may occur if there are other numbers in the file content which look similar to dates. The rules are so long because I wanted to minimize wrong date recognitions, but there may be dates where it still happens (e.g. with a date pattern matching Val #).
- Sometimes the OCR software does a bad job and adds white space inside the OCR’d date text, like 01 / 01 / 13 instead of 01/01/13. The rules won’t find such dates.
- Dates before the year 2000 won’t work, because supporting them would make the rules even longer and much slower.
- I googled for some American receipts to test the rules. I’ve found some (they are in the Test Files folder) which cause an improper renaming, and some which work well. Just test them yourself if you want.
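The white space issue is the kind of thing a script could repair before any date matching takes place. The following Python sketch is not part of the Hazel rules; the function name and the regular expression are assumptions for illustration. It removes spaces and tabs around '.', '/' and '-' whenever the separator sits between two digits.

```python
import re

# A '.', '/' or '-' between two digits, with optional spaces or tabs around it.
# Line breaks are deliberately not touched.
SPACED_SEPARATOR = re.compile(r"(?<=\d)[ \t]*([./-])[ \t]*(?=\d)")


def tighten_date_separators(text):
    """Turn '01 / 01 / 13' into '01/01/13'; leave other text alone."""
    return SPACED_SEPARATOR.sub(r"\1", text)
```

Be aware that this also tightens things that are not dates, like '3 - 2', so it only makes sense on text that is used for date detection and then thrown away.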
Note 1: I'm not 100 % sure, but almost, that the 'disappearing file' and 'duplicated file' issues are caused by Dropbox. Because the Hazel rules do a lot of things in a short time - moving the files around and renaming them - there may be some versioning problems. So to minimize the risk of these problems, I recommend not putting the 'Add Date Rules' into the Dropbox folder. Just set up the folders with the rules in a different place, like the regular 'Documents' folder. If you still need to rely on Dropbox for uploading and accessing the files, you can add additional Hazel rules which move the uploaded files to the 'Add Date Rules' folders, and others which move the renamed files from the '5.0 Check' folder to their destination, which can be Dropbox again.
Note 2: There is a way to deal with the problem of false-positive date matches if certain files always look the same. Before starting the 'Add Date' rules, a file could be put into another folder, which looks for a specific keyword (one that can only be found in these files). If the keyword is not there, the file will be moved to the 'Add Date' folders. But if the keyword is there, the file will be moved into a specially prepared Hazel rules folder, which will run other rules. These rules could be the same as the other 'Add Date' rules, but you could delete all the unnecessary or error-causing rule entries.
BUT: If you want to move the folders to a different place, first import the rules into a folder structure like mine. If you don't, you'll need to adjust the 'Move To...' part of every single rule. But if you import the rules into my folder structure and then move (or even rename) the folders, Hazel will track the changes and do the work for you.
Get the Rules
So finally, here are the rules.
It took me a very long time to set them up and to test them. That is why I decided to make the rules freemium. That means that you can download a basic set of rules, which will only detect files with the date 01/01/2013 (but in all the styles mentioned above). All you have to do is copy the rules and adjust them. But I warn you ;-), you’ll need to make 30 adjustments for the Add Day rule, 11 for the Add Month rule, and up to 12 for the Add Year rule. Each of the adjustments has plenty of single entries which have to be changed. The freemium version has no fallback, add location and add person rules, and no test files either.
The other option is to donate 4$ for the U.S./English Rules, 3€ for the European/German Rules or £2.70 for the European/UK Rules (donation via PayPal, sorry for that, but this is the only cost-efficient way for payments) and to get all the rules, which will work from 01/01/2000 to 12/31/2013 (European: 01.01.2000 to 31.12.2013). By getting the donation rules you’ll not only save plenty of time, but you’ll also support my ad-free website and motivate me to start similarly complex projects in the future :-)
Note: At the end of each year I’m planning to release the Add Year rule for the upcoming year for free.
European/German Rules and European/UK
To get the complete European/German donation rules, click the donate button below (on a Mac, because iOS devices don't support downloading). The download starts directly after the PayPal check-out. If a problem occurs during the download process, please contact me at contact[at]macorios.com.
Note: Users of languages other than German or UK English will need to adjust the written month names (or delete them, which improves speed but costs features).
Full European/German Rules (3€):
Full European/UK Rules (£2.70):
U.S./English Rules
To get the complete U.S./English donation rules, click the donate button below (on a Mac, because iOS devices don't support downloading). The download starts directly after the PayPal check-out. If a problem occurs during the download process, please contact me at contact[at]macorios.com.
Full U.S./English Rules (4$):
Feel free to leave a comment, or to subscribe to my RSS feed :-)