Using AppleScript to Automate Notetaking

On a recent episode of Gabe Weatherhead’s Generational podcast, he spoke with Walton Jones, professor of Behavioral Neurobiology at the Korea Advanced Institute of Science and Technology. They talk about Professor Jones’s system for annotating and summarizing academic papers about twenty minutes into the podcast. Jones has further detailed his academic workflow on his blog, so be sure to give his explanation a read.

I’ve noted before how I manage my PDFs using the filesystem and OpenMeta tagging. I’ve tended to maintain my notes in plain text written directly into DEVONthink, but after listening to Weatherhead’s talk with Jones and reading Jones’s post, I’ve decided to adopt part of his system.

As a scientist, Jones spends much of his time synthesizing the latest research, which normally comes to him as PDFs from journals. Two parts of his system caught my interest: 1) his color-coded annotations and 2) his method of extracting those annotations to plain text. He assigns a color to each kind of note: green for references, red for summaries, and so on. What really inspired me was his AppleScript, which scans a PDF he has marked up (in either Skim or iAnnotate) and extracts the notes by category as Markdown. He then dumps the notes into VoodooPad. Be sure to read his explanation of his system, as my summary doesn’t do it complete justice.

The system relies on an AppleScript that looks for annotations in the PDF and extracts the text into Markdown-formatted plain text. I modified the script slightly for my own needs:

(*

Script courtesy of Walton Jones, modified slightly 
http://drosophiliac.com/2012/09/an-academic-notetaking-workflow.html

Original script by John Sidiropoulos
http://www.organognosi.com/export-skim-notes-according-to-their-highlight-colors/

*)

tell application "Skim"
    set the clipboard to ""
    set numberOfPages to count pages of document 1
    
    activate
    set myColorCodes to my chooseColor()
    
    set firstPage to 1
    set lastPage to numberOfPages
    set the clipboard to "# Notes #" & return & return
    
    repeat with currentPage from firstPage to lastPage
        set pageNotes to notes of page currentPage of document 1
        exportPageNotes(pageNotes, currentPage, myColorCodes) of me
    end repeat
    
end tell

on exportPageNotes(listOfNotes, pageForProcessing, myColorCodes)
    -- Category names, in the same order as the color codes from chooseColor()
    set categoryNames to {"Summary", "Methods", "Arguments", "Reference", "Thesis", "Question or connection"}
    tell application "Skim"
        
        repeat with coloredNote in listOfNotes
            
            repeat with i from 1 to the count of myColorCodes
                -- Export only notes whose highlight color matches a known category
                if color of coloredNote is item i of myColorCodes then
                    set categoryName to item i of categoryNames
                    set noteText to get text for coloredNote
                    -- Append a Markdown link to the PDF page, followed by the note text
                    set the clipboard to (the clipboard) & "**[" & categoryName & "]" & "(" & name of document 1 & "#page=" & pageForProcessing & ")**" & ":   " & return & noteText & return & return
                end if
            end repeat
        end repeat
        
    end tell
end exportPageNotes

on chooseColor()
    set selectedColors to ({"Summary", "Methods", "Arguments", "Reference", "Thesis", "Question or connection"})
    set colorCodes to {}
    set noteColor to ""
    repeat with noteCol in selectedColors
        set noteColor to noteCol as text
        if noteColor is "Summary" then
            set end of colorCodes to {64634, 900, 1905, 65535}
        else if noteColor is "Methods" then
            set end of colorCodes to {64907, 32785, 2154, 65535}
        else if noteColor is "Arguments" then
            set end of colorCodes to {65535, 65531, 2689, 65535}
        else if noteColor is "Reference" then
            set end of colorCodes to {8608, 65514, 1753, 65535}
        else if noteColor is "Thesis" then
            set end of colorCodes to {8372, 65519, 65472, 65535}
        else if noteColor is "Question or connection" then
            set end of colorCodes to {64587, 1044, 65481, 65535}
            
        end if
    end repeat
    
    return colorCodes
end chooseColor

UPDATE 11/29/12
I was remiss in not pointing out that the original AppleScript adapted by Walton Jones came from John Sidiropoulos at his blog OrganoGnosi. His blog has lots of advice on using digital tools for academic research.

I take my notes in Skim, which would result in something like:

[Image: Skim notes]

When the script is run on a PDF, it results in a note formatted in Markdown that looks similar to this:

# Notes #

**[Reference](example.pdf#page=3)**:
Reference text would appear here extracted automatically from the PDF.

That’s where the other half of the magic in Jones’s system comes in. The note includes not only the text I wanted but also a hyperlink to the page it came from. Once the Markdown is rendered, I can click the link and be taken back to the source. My notes used to look similar, often taking a form such as:

[3] Noting the page number in brackets followed by my notes, thoughts, direct quotes, and so on from a PDF or book.
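
To make the link format concrete, here is a minimal sketch of how a single entry is assembled. The handler name formatNoteEntry is hypothetical and not part of the original script; it builds the same Markdown shape the script appends to the clipboard, but uses linefeed where the script uses return.

```applescript
-- Hypothetical helper (not in the original script): builds one Markdown
-- entry whose link points at a page of a PDF in the same directory.
on formatNoteEntry(categoryName, pdfName, pageNumber, noteText)
    -- The link target is a relative filename plus a #page= fragment
    set linkTarget to pdfName & "#page=" & pageNumber
    set entryHeading to "**[" & categoryName & "](" & linkTarget & ")**:   "
    return entryHeading & linefeed & noteText & linefeed & linefeed
end formatNoteEntry

-- Example call with made-up values
formatNoteEntry("Reference", "example.pdf", 3, "Reference text would appear here.")
```

Because the link target is only a filename, it resolves relative to wherever the notes file lives.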

As I mentioned, my notes were previously entered directly into DEVONthink. But with this new system I’ll be keeping my notes in the same directory as the document I’m taking notes on. From there, DEVONthink will index the directory for easy searching and organizing.

UPDATE 11/27/12
ProfHacker has graciously republished this entry on their blog on 2012-11-27.

UPDATE 11/28/12
Readers have pointed out that the hyperlinking to specific pages isn't working the way it should. The solution, as near as I've been able to replicate the problem, unfortunately points to just how fragile this system is.

Walton Jones had to work around the problem by writing his own custom URL scheme. You may need to adopt his system to get everything working. But Skim seems to handle page numbers without any problems, at least for me. There are a few things to bear in mind when using the script: 1) the notes must be saved in the same directory as the PDF, and 2) the filename in the link must exactly match the PDF's filename (which the script should handle for you). So, for example, a link to example-article.pdf#page=3 requires a file named exactly example-article.pdf in the same directory as the notes; otherwise the link has no way to locate the file. Also, be sure that the filename contains no spaces, otherwise the Markdown linking will not work.
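
If you want to catch the filename problem before exporting, a small check along these lines might help. This is only a sketch: the handler names hasLinkSafeName and suggestLinkSafeName are my own inventions, and replacing spaces with hyphens is just one possible renaming convention.

```applescript
-- Hypothetical helper: true when the filename contains no spaces.
on hasLinkSafeName(fileName)
    return fileName does not contain " "
end hasLinkSafeName

-- Hypothetical helper: suggests a name with each space replaced by a hyphen.
on suggestLinkSafeName(fileName)
    set savedDelimiters to AppleScript's text item delimiters
    -- Split the name on spaces, then rejoin the pieces with hyphens
    set AppleScript's text item delimiters to " "
    set nameParts to text items of fileName
    set AppleScript's text item delimiters to "-"
    set safeName to nameParts as text
    set AppleScript's text item delimiters to savedDelimiters
    return safeName
end suggestLinkSafeName

-- Example with a made-up filename
suggestLinkSafeName("example article.pdf")
```

The handlers only inspect and suggest names; renaming the PDF itself is left to you, since the notes and the PDF need to stay in step.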

The other area that makes the system tricky to use is the way I'm using it. When I transform the text notes into Markdown, I save the resulting markup as a PDF (either transformed in Marked.app or fed through wkpdf). The PDF file of my notes is opened in Skim, which can handle linking back to the article because all of these actions are happening within the same application. In other words, if you are planning on using the hyperlinking system as I use it, you will need to contain all activity in Skim. Otherwise, you may need to look into Walton Jones's custom URL scheme.

UPDATE 11/29/12
John Sidiropoulos, who wrote the original AppleScript to extract notes from Skim documents by color, which Walton Jones and I adapted, has a brilliant post on DEVONthink and hyperlinks that nicely complements the update from yesterday.

November 24, 2012 @jaheppler