How I Capture Reading Notes in Obsidian

In addition to automating my daily notes with Obsidian, it quickly became clear to me that Obsidian’s note-linking capabilities would allow me to capture my reading notes in Obsidian in a really useful way. Moreover, because of Obsidian’s powerful linking capability, it occurred to me that my Obsidian vault could serve as a database for my reading. Describing how I managed to do this (so far), step by step, will require a little history first.

A Brief History of My Reading List

I began keeping a list of every book I finished reading back on January 1, 1996. Although I am no longer certain of why I started keeping the list (was it part of a New Year’s resolution?), I am fairly certain that I was influenced by an early reading list I found on the Internet, Eric W. Leuliette’s “What I Have Read Since 1974”.

As a developer (even back then), I decided I would build an elaborate relational database to store my reading list. Over the years, it went through many iterations and forms. When time became short, I moved the list out of the database and into Excel or Google Sheets. Finally, several years ago, I settled on a plain text file using Markdown format, and that is how I’ve kept my list ever since.

But I’ve been bothered by shortcomings in this list. There are redundancies I don’t like about it. I have no easy way of referring to books or authors separate from the list. There are things I’d like to automate, but the format makes that tricky.

A Brief History of My Reading Notes

With all of the reading I do, I have trouble remembering important details of what I read about. So I started keeping notes on my reading. This evolved out of how I kept notes on my reading back in college, and has continued to evolve over the decades since. It was in college that I first decided it was okay for me to write in my books. After all, if I was spending so much money on them, I might as well make them my own, right?

These days, I highlight books and write in the margins, and with e-books, I highlight and make short notes on my Kindle devices and apps. But I still have no good way of aggregating these notes into useful groups or categories, and certainly no way of readily searching them.

As I started using Obsidian, and began to see how I could better organize my books and reading lists in its vault structure, I began to get a hint of ways that I might start to link my reading notes back to the books they are associated with, my reading, and other notes.

Enter Zettelkasten

I’d never heard of Zettelkasten before I started using Obsidian. Zettelkasten was originally invented as a way to link paper notes together to be able to easily create connections (links) between them. While it was workable on paper, such a process could be greatly improved with hypertext tools, and it so happens that Obsidian’s note-linking capability is ideal for this.

One important idea from Zettelkasten is that a note should contain a single thought or piece of information (say, a passage highlighted in a book). That note is given a unique identifier. In addition to the passage, one would add their own thoughts to the note, and perhaps further link that note to other notes and ideas that are related to it. Zettelkasten has its own unique numbering system for “naming” the notes. Obsidian has a plug-in for creating a “Zettelkasten number” for this purpose that is based on the date/time the note is created. I wasn’t particularly fond of that identifier because it duplicates information already contained in the note itself. After all, the note is just a file in the file system, and has its own create and modified date/times as part of the file. A good identifier doesn’t embed real data. It’s just an identifier.

I also struggled a bit to figure out how this would work for my reading notes. I originally imagined that if I had a note for each book I read, I could simply add my highlights and annotations to that note. Zettelkasten, however, suggested that rather than adding that highlight to the book note, I’d create a separate note for just the highlight or annotation, and then link it to the book note, as well as any other notes it might make sense to link it to. This took a while for me to process, and I thought about it a lot as I built out my reading library in Obsidian.

My Obsidian Library

So how did I decide to structure my reading notes in Obsidian? I’ll try to go through the step-by-step process I have for putting this all together, in case someone is interested in reproducing this.

Step 1: Establishing the structure

I decided that because of Obsidian’s great linking capability, I could use the file system itself as a relational database. In deciding this, I further decided that there were 3 main “objects” I wanted to be able to capture at a kind of atomic level. That is, three things that make up the structure of my reading library:

  1. Things I read, e.g., books, articles, stories, etc.
  2. Authors: the people who write the things in #1.
  3. My notes as they relate to #1 and #2.

From this, I established the following structure of folders within my Obsidian vault:

My folder structure for Reading notes in Obsidian.
  • Commonplace Book contains all of my reading notes.
  • Library contains all of the “atomic” notes that make up my reading library:
    • Authors: a single note for each unique author in my library
    • Articles: a single note for each unique article (often not tied to a book) in my library.
    • Book: a single note for each unique book in my library
    • Essays: a single note for each unique essay in my library; these are often related to books.
    • Stories: a single note for each unique story in my library.
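
For anyone reproducing this layout, the folders could be created by hand or with a few lines of Python. What follows is only a minimal sketch: the folder names come from the list above, while the function name `create_reading_folders` and the vault path passed to it are assumptions made for illustration.

```python
from pathlib import Path

# Folder names taken from the list above.
LIBRARY_FOLDERS = ["Authors", "Articles", "Book", "Essays", "Stories"]


def create_reading_folders(vault: Path) -> list[Path]:
    """Create the Commonplace Book and Library folders under the vault root."""
    folders = [vault / "Commonplace Book"]
    folders += [vault / "Library" / name for name in LIBRARY_FOLDERS]
    for folder in folders:
        # exist_ok lets the function be re-run safely on an existing vault.
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling `create_reading_folders(Path("path/to/vault"))` returns the six folder paths, whether or not they already existed.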

Step 2: Deciding what goes into a note

Once I had my structure, I had to decide what goes into a note of each type. What is it I want to know about authors, books, stories, etc.? This was fairly easy for me as I’ve been thinking about it for a long time (years, actually). I had in mind an idea that I could write an API that uses these files as a database to query them and produce results. With that in mind, I decided to start by keeping things simple, knowing that I could add detail as needed going forward.

For authors, I wanted just some basic information. Here is a typical author note, in this case, for Alan Lightman, whose new book I read earlier this week:

A sample author note for Alan Lightman.

The backlinks section is generated automatically by a script that I have that runs nightly. I know that I could just click on the “Linked mentions” in Obsidian to see all of the backlinks, but I wanted the related books on the note as a reference in case I access the file outside of Obsidian.
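
The nightly script itself is not shown here, but the core of such a job might look like the sketch below. It assumes notes are `.md` files that use Obsidian-style `[[wikilinks]]`; the function name `collect_backlinks` is hypothetical, and the sketch only builds the backlink map, leaving the step of writing the section into each author note aside.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches [[Target]], [[Target|alias]] and [[Target#heading]],
# capturing only the target note name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def collect_backlinks(vault: Path) -> dict[str, list[str]]:
    """Map each link target to the sorted names of the notes that link to it."""
    backlinks = defaultdict(set)
    for note in vault.rglob("*.md"):
        text = note.read_text(encoding="utf-8")
        for target in WIKILINK.findall(text):
            backlinks[target.strip()].add(note.stem)
    return {target: sorted(sources) for target, sources in backlinks.items()}
```

Looking up an author’s note name in the returned dictionary gives the list of notes that mention that author, which is the information a generated backlinks section needs.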

For books (or essays, stories, articles), I also kept things simple. A typical book (or essay, or article, or story) looks like this:

A sample title note for In Praise of Wasting Time

Note that in both authors and books there are links back and forth between the files. The book file refers to the author. The author file has link references back to the books. Moreover, you’ll note that in the book, there is an “Annotations” section with a list of links. These are auto-generated links to my notes and highlights for the book. I’ll have more to say on these shortly, but the important thing is that each note and highlight is a separate file (in the Zettelkasten vein) and is included with the book as a “transclusion” link, meaning that when I view the note in preview mode, it “includes” the linked files as part of the note, like this:

Title note in preview mode with transcluded annotations visible.

Step 3: Populating the database

Once I had the structure I wanted, I needed to populate my database. I was fortunate in this regard on 2 counts: (1) I happened to recently create a SQLite database of my books, and (2) I can write code relatively easily. I wrote a script that crawled my book database and, from it, created the notes for books and authors in Obsidian. This turned out to be a surprisingly simple exercise. (The Python script was 130 lines.)
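
The 130-line script is not reproduced here. As a rough illustration of the idea, the sketch below reads rows from a SQLite database and writes one Markdown note per book. The table name `books`, its columns (`id`, `title`, `author`), the note layout, and the function name `export_book_notes` are all assumptions; the real schema and note format may differ.

```python
import sqlite3
from pathlib import Path


def export_book_notes(db_path: str, vault: Path) -> int:
    """Write one Markdown note per row of a hypothetical `books` table."""
    book_dir = vault / "Library" / "Book"
    book_dir.mkdir(parents=True, exist_ok=True)

    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT id, title, author FROM books").fetchall()
    finally:
        conn.close()

    for book_id, title, author in rows:
        # The [[...]] link is what ties the book note to its author note.
        body = f"# {title}\n\nAuthor: [[{author}]]\n\n## Annotations\n"
        note = book_dir / f"{title} ({book_id}).md"
        note.write_text(body, encoding="utf-8")
    return len(rows)
```

A real version would also need to sanitize titles containing characters that are not valid in filenames (colons or slashes, for example) and to create the matching author notes.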

My digital commonplace book

I first learned of commonplace books reading a biography of Thomas Jefferson (in this case, it was Willard Sterne Randall’s Thomas Jefferson: A Life). Jefferson (and others in his time) would copy passages from their reading into a book. This helped with memorization, but it also provided a resource where they could add notes and observations. I’ve always liked this concept, and I decided that Obsidian would finally allow me to put it into action in a way I’d envisioned.

It is trivial to create a note and add it to the note containing the book to which it is related. But what if the note ultimately relates to more than one thing? Reading about Zettelkasten provided me with insights into how I might handle this. The naming convention in Zettelkasten (and the way it is implemented in Obsidian) bothered me. Neither made much sense. How do you search for things with essentially coded filenames?

I was in the shower when I finally had a breakthrough insight on this. I’m not searching for a filename, I’m searching for file content. If each annotation and highlight is its own note, I can link it to as many notes as makes sense. Furthermore, I can add tags to each note. The name of the file doesn’t matter. What matters is how it links to other notes, and that all files are searchable.

I still didn’t like the file-naming scheme for Zettelkasten in Obsidian, which essentially uses a datetime stamp down to the current second. So a file might be named: 20210215084456. Given that one is not likely to create two of these notes within the same second, it guarantees uniqueness. But from a database perspective, identifiers like these are not supposed to embed any information. They should be strictly identifiers. Moreover, with the date embedded in the note title, I would be duplicating information that already exists in the file properties.

I decided instead to use a GUID, or what is sometimes called a UUID. This is another form of unique identifier that doesn’t embed information; it just produces a unique code. (For those tech-savvy folks reading this, I used Python’s uuid4, which doesn’t use the MAC address as part of the identifier.)
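
In Python, generating such an identifier takes one call to the standard library’s `uuid.uuid4()`, which builds the value from random bits, whereas `uuid.uuid1()` derives it from the host ID and the current time. A minimal sketch follows; the helper name `new_note_name` and the `.md` extension are assumptions for illustration.

```python
import uuid


def new_note_name() -> str:
    """Return a filename for a new annotation note based on a random UUID."""
    # uuid4() is random, so the name carries no date or machine information.
    return f"{uuid.uuid4()}.md"
```

Each call returns a different name made of 36 characters of hexadecimal digits and hyphens plus the extension.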

When I have a new note or highlight for a book, it goes into my Commonplace Book folder in Obsidian. These notes also have a specific structure. A typical one looks like this:

A typical note, Zettelkasten style.

Each annotation begins with a Source that links back to the source for that annotation. It may or may not have tags associated with it. That is followed by the body of the annotation, which may be a highlighted passage. Finally, there are my own notes related to the specific passage. In the above example, my notes also link to another book, making this particular annotation related to more than one note. That is, a link has been created between Creativity, Inc by Ed Catmull and Amy Wallace, and On Writing by Stephen King.
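
The structure described above (source link, optional tags, passage, then personal notes) could be generated with a small helper like the one sketched here. The exact labels and Markdown layout are assumptions, since the original note appears only as a screenshot, and `render_annotation` is a hypothetical name.

```python
def render_annotation(
    source: str,
    passage: str,
    notes: str = "",
    tags: tuple[str, ...] = (),
) -> str:
    """Build the Markdown body of one annotation note."""
    lines = [f"Source: [[{source}]]"]
    if tags:
        lines.append("Tags: " + " ".join(f"#{tag}" for tag in tags))
    # The highlighted passage goes in as a Markdown blockquote.
    lines += ["", f"> {passage}"]
    if notes:
        # Notes may contain further [[links]] to other books or notes.
        lines += ["", notes]
    return "\n".join(lines) + "\n"
```

Because the notes text is inserted as-is, any `[[link]]` written in it (to another book, say) becomes a second connection for the same annotation.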

Automating my annotations

Over the weekend, I got a start on automating these annotations. I wrote a Python script that reads CSV files exported from Kindle, and creates a unique note for each annotation in the file, relating it back to the source book in my Library. My process is roughly this (I say roughly because this is still new):

  1. When I finish reading a book, I export the annotations from my Kindle, which sends me an email. That email has a CSV attachment which I save in a folder.
  2. A script runs, and processes any CSV files I have in the folder, creating the notes and links.
  3. The script outputs a list of the resulting annotations for each file. I copy this and paste it into the “Annotations” section of the source book or article. That makes it easy to view the annotations inline when previewing the note. An example of the output from the script looks like this:
Output from my annotation import script.
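
The import steps above can be sketched in a few lines of Python. This is not the actual script: it assumes a CSV with a header row and a column named `Annotation` (a real Kindle export may use different column names or carry extra preamble rows that need skipping), and `import_annotations` is a hypothetical name. The returned links use Obsidian’s `![[...]]` embed syntax so they can be pasted into a book’s “Annotations” section.

```python
import csv
import uuid
from pathlib import Path


def import_annotations(csv_path: Path, source: str, out_dir: Path) -> list[str]:
    """Create one note per CSV row and return transclusion links for the book note."""
    out_dir.mkdir(parents=True, exist_ok=True)
    links = []
    with csv_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            text = (row.get("Annotation") or "").strip()
            if not text:
                continue  # skip rows that have no annotation text
            note_id = str(uuid.uuid4())
            body = f"Source: [[{source}]]\n\n> {text}\n"
            (out_dir / f"{note_id}.md").write_text(body, encoding="utf-8")
            links.append(f"![[{note_id}]]")
    return links
```

Printing the returned list, one link per line, gives output that can be copied into the source note.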

Toward an API for my books and annotations

I am able to do the above automation because I have a standardized structure to my books and author notes. That standardization allowed me to write an API for my book library. From this API I can, for instance, check to see if a title exists in my library already. I can grab information about a book or author and then use it in some way. The API typically returns data in JSON format. For instance, if I call the function biblio.search_by_title("Beyond"), I get a JSON formatted return containing the following:

[
   {
      "title":"_Beyond Band of Brothers: The War Memoirs of Major Dick Winters_",
      "link":"[[Beyond Band of Brothers (334)]]",
      "type":"book",
      "authors":[
         {
            "author":"Winters, Richard",
            "authorFirstLast":"Richard Winters",
            "authorLink":"[[Winters, Richard]]",
            "gender":"None"
         }
      ],
      "source":"",
      "date":""
   },
   {
      "title":"_Beyond Apollo_",
      "link":"[[Beyond Apollo (58)]]",
      "type":"book",
      "authors":[
         {
            "author":"Malzberg, Barry N",
            "authorFirstLast":"Barry N Malzberg",
            "authorLink":"[[Malzberg, Barry N]]",
            "gender":"m",
            "alternateNames":[
               {
                  "name":"Barry, Mike",
                  "nameLink":"[[Barry, Mike]]"
               }
            ]
         }
      ],
      "source":"",
      "date":""
   },
   {
      "title":"_Beyond the Blue Event Horizon_",
      "link":"[[Beyond the Blue Event Horizon (259)]]",
      "type":"book",
      "authors":[
         {
            "author":"Pohl, Frederik",
            "authorFirstLast":"Frederik Pohl",
            "authorLink":"[[Pohl, Frederik]]",
            "gender":"None"
         }
      ],
      "source":"",
      "date":""
   }
]
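
The internals of `biblio` are not shown, so the following is only a guess at how a title search over plain files might work. It matches the query against note filenames under a library folder and returns a reduced version of the JSON above (title, link, and type only); taking the type from the lowercased folder name is an assumption.

```python
import json
from pathlib import Path


def search_by_title(library: Path, query: str) -> str:
    """Return a JSON list of notes whose filename contains the query (case-insensitive)."""
    needle = query.lower()
    matches = []
    for note in sorted(library.rglob("*.md")):
        if needle in note.stem.lower():
            matches.append(
                {
                    "title": note.stem,
                    "link": f"[[{note.stem}]]",
                    # e.g. a note stored in "Book" is reported as type "book"
                    "type": note.parent.name.lower(),
                }
            )
    return json.dumps(matches, indent=3)
```

A fuller version would parse each matching note to fill in the authors, source, and date fields, and would probably exclude the Authors folder from a title search.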

The results so far

I’ve linked all of this together using my master reading list note. This note contains a list of everything I have read since 1996 and serves as a kind of index to my reading:

A sample from my reading list index note.

A big part of the way Obsidian works is that it can show you the relationships between your notes. While I am still working on importing all of the notes I have in my Kindle, I can already see a network of relationships when I view the graph of my Obsidian vault:

A graph of the relationships between all of my notes.

Most of my notes are book and reading-related at this point. That big dot in the center is the master reading list illustrated above. If I highlight it, this is what I see:

Sample of a highlighted node on the graph.

From there, you can see other nodes and relationships that have started to form. For instance, if I hover over one of the Alan Lightman books I finished yesterday, In Praise of Wasting Time, you can see a little network of links coming off that book:

Some of those links point to annotation files. Another points back to the note for Alan Lightman. And a few of the annotation links point to seemingly unrelated notes.

Here is another example. One of the big nodes is for John W. Campbell, editor of Astounding Science Fiction from the late 1930s through his death in the early 1970s. I read many of those old issues when I was taking my Vacation in the Golden Age of Science Fiction. So Campbell shows up a lot on my master reading list:

Highlighting an author node on the graph.

You can see that Campbell is linked to all of the issues of Astounding that I have read. I have started to bring my notes in for those issues. If we look at the July 1939 issue, for instance, you can see this is related to all of the stories and articles and authors in that issue:

Currently, the notes for each story are part of the story note, but I plan on breaking those out into their own Zettelkasten-style notes as I’ve done for my other notes and annotations.

Conclusions

Keep in mind that this is all being done with plain text files, something that I like because the format is compatible virtually anywhere. This could be done as easily on a Windows machine as a Mac. It could be done easily on a Linux machine. The openness and longevity of plain text (which has been around for fifty years now) is a big part of what I like about this system.

The linking that Obsidian provides from within its application makes all of this useful. But once established, those links are just as useful outside Obsidian with a little coding, as I’ve done with my API for books and authors. And this API is extensible. This week, I plan to add capability for the API to return any annotations when returning a “book” object. So in addition to what is shown in the JSON illustrated above, the result will soon contain a node for annotations related to that book.

Mostly, I am satisfied that I now have a simple way of keeping my reading notes in a useful form. These are easily searchable and easily linked. I can continue to capture highlights and brief notes as I read. The import function allows a nice step to expand on my annotations as I review them after they’ve been pulled into Obsidian.

It did take me some time to get the infrastructure in place, but now that it is there, I am able to focus on reading and notes, and let the system organize them for me.

A Digital Commonplace Book Protocol for Internet Annotations

Any time I find myself thinking how great the Internet is, my thoughts drift to the one area in which I find it sorely lacking: a native annotation capability. The notion of hypertext and the linking of documents in a digital network is genius. It seems to me that if you can conceive of this, you have to understand that it works well only when you are talking about large volumes of “pages” or documents. And if you imagine large volumes of documents and the rabbit hole the links lead you down, you’ve got to wonder how we missed the native ability to annotate these documents as we go.

There are third-party tools that help with this. These tools can clip articles and store the clipped versions locally. Some of them provide tools for marking up the articles: highlighting, or adding your own annotations. But these often feel flimsy to me. It seems to me that what we need is a foundational tool for annotating what we read on the Internet. And while this doesn’t seem to be an operating system level function, it certainly seems to be something that a well-designed web browser should be able to do as part of its basic functionality.

Requirements for a Digital Commonplace Book protocol

The Internet is full of standards and protocols, and if I were designing web browsers, I’d see if I could come up with a standard for what I’d call a “digital commonplace book” protocol–DCB for short, since above all, a protocol must have an abbreviation. My DCB protocol would differ in some ways from tools like Pocket or Evernote. My list of requirements for a DCB would look as follows:

  • HTML protocols would be extended to support DCB. There is some additional metadata that would be needed to support annotations, like versioning, and being able to locate specific pieces of text or objects on the page.
  • Highlights and annotations could be stored locally or in a service (like Evernote or Pocket)
  • In addition to the highlights and annotation data being stored, information about the specific page and page version would need to be stored as well.
  • When a browser renders a page, it would look at the digital commonplace book to see if that page had entries, and apply CSS to the page to show the highlights and annotations when viewing the page.
  • If the page had changed since the annotations were captured, the meta-data collected as part of the annotation could provide a reference to the page version it was captured on, and the browser would have a kind of timeline slider for the page that allows you to scroll back in time to see the page as it was at the time you captured your annotations.
  • On the server side, it might be useful to log how many times (and what parts) of a page have been annotated. These would all be anonymous logs. The site owner would not know who annotated the pages, just that (a) there was an annotation, and (b) which part was annotated. This would be optional at the site, or even page level.
  • The annotations features would be built into the browser, and the storage format would be standardized, and compatible with any browser.
  • Browsers would provide mechanisms for tagging, updating, viewing, searching, and exporting annotations.
  • The format of the annotations would be an open format readily accessible to other applications and protocols. JSON might be a good start.
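
To make the requirements above a little more concrete, here is one way a single stored annotation might look if JSON were the format. Every field name is an assumption invented for this sketch (there is no DCB specification to follow): the record keeps the page URL, a page version, the highlighted text and its position, and the reader’s note and tags.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DcbAnnotation:
    """One highlight or note, tied to a specific version of a page."""

    url: str            # page the annotation belongs to
    page_version: str   # identifier for the page version that was annotated
    quote: str          # the highlighted text itself
    start_offset: int   # where the highlight starts in the page text
    end_offset: int     # where the highlight ends
    note: str = ""      # the reader's own comment, if any
    tags: list[str] = field(default_factory=list)


def to_json(annotation: DcbAnnotation) -> str:
    """Serialize an annotation to an open, tool-friendly JSON format."""
    return json.dumps(asdict(annotation), indent=2)
```

Because the record is plain JSON, it could live in a local file or in a service, and any browser or other application could read it back with `DcbAnnotation(**json.loads(text))`.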

The bottom line for me is that if I highlighted some text on a page in a browser, and made some notes about it, and then came back to that page a few days later, the browser would show my highlights and annotations inline as I viewed the page.

Making Kindle annotations compatible with the DCB protocol

The other major source of annotations I make is the books, magazines and newspapers I read. And while Kindle provides a useful mechanism for highlighting and annotating passages, it’s a lot easier to get the data in than to get the data out.

I would make Kindle apps and devices compatible with the DCB protocol. I imagine this wouldn’t be too difficult, considering that the annotation functionality is already there in the devices and apps, and would just require some tweaking to make it compatible with such a protocol. The part I would spend a lot of time on is making sure it is as easy to get my annotations out as it is to get them in.


I was thinking about this because I am getting ready to write my next post on how I’m using Obsidian to catalog my reading and reading-related notes. I read in all kinds of mediums. I read web pages, articles in apps, on the Kindle, via Audible audiobooks, and of course on good old-fashioned paper.

For me, all books are interactive. I converse with the authors in the margins. I highlight passages and come back later and make notes on why I highlighted them. Long gone are the days when I revered the pristine look of the printed page, over the page that I have made my own. I encourage my kids to do the same.

In thinking about how I’ve organized my reading notes in Obsidian, it occurred to me my methods could be greatly improved if there was a standardized protocol for a digital commonplace book. Alas, one doesn’t exist at this point.

But at least you now have the context for why I organize my reading notes the way I do–but I’m getting ahead of myself here. That will have to wait for another post.