Last month, I posted some analytics on 12 years of work email that I’ve sent. At the time, I had data only for my work email, and only for sent messages. I wanted to look at my personal email activity as well, but two things were preventing me:
- I didn’t have the time.
- I wanted to do it all in Mathematica, which I am slowly teaching myself from scratch.
Well, I found a little bit of time, and I needed only a little because Paul-Jean Letourneau, a lead developer at WolframAlpha, wrote a post on how to use Mathematica to do the kind of email analytics that Stephen Wolfram posted about last month. This was great because the post contained all of the code needed to do this kind of analysis, and all I need to learn a new system quickly are some good code examples. The code provided worked almost without change on my own MacBook instance of Mathematica. I had to make a few minor changes to get the mailboxes I wanted, and I had to add the following line to increase the heap space for Java:
ReinstallJava[CommandLine -> "java", JVMArguments -> "-Xmx3024m"]
Without that line, the code executed fine for sent mail, but ultimately resulted in an out-of-memory error for incoming mail.
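For anyone who wants to confirm that the larger heap actually took effect, here is a minimal sketch using J/Link. It assumes the same `java` command and `-Xmx3024m` setting as the line above; the variable name maxHeapMB is made up for illustration, and the value the JVM reports can be somewhat lower than the figure requested.

```mathematica
(* Load J/Link, which provides ReinstallJava and LoadJavaClass *)
Needs["JLink`"];

(* Restart the Java runtime with a larger maximum heap, as in the line above *)
ReinstallJava[CommandLine -> "java", JVMArguments -> "-Xmx3024m"];

(* Ask the running JVM for its maximum heap size and convert bytes to megabytes.
   maxHeapMB is a hypothetical name used only in this sketch. *)
LoadJavaClass["java.lang.Runtime"];
maxHeapMB = Runtime`getRuntime[]@maxMemory[]/2.^20
```

If the number that comes back is close to the default heap size instead, the JVM was probably not restarted with the new argument.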
The resulting data is a pretty good look at my personal email use over the last 7 years. We’ll start with email that I’ve sent. This goes back only to 2009 because that is when I switched from Panix to Gmail. The code looked at my Sent Mail folder in Gmail and picked out email sent from my Gmail address. I had years of imported mail from Panix, but those sent messages are from a different email address, and I decided not to move things around or change the code to include them. It’s still 3 years of sent mail data, which is good enough for some analysis. Here is the diurnal plot of my sent email, a total of 4,382 messages:
It’s a pretty sparse chart, not nearly as dense as my work email, but there are a few things of note. I generally don’t start sending email until just before 9am. And the volume of my email sending increased at the beginning of 2011. Also of note is the gap in late 2011; this is when I was on vacation and wasn’t doing very much email.
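For readers unfamiliar with diurnal plots, the idea is simple: each message becomes one point, with the calendar date on the horizontal axis and the time of day on the vertical axis. The following is a minimal sketch of that idea, not the code from Paul-Jean's post. The timestamps are invented sample data, and the helper name diurnalPoint is hypothetical.

```mathematica
(* Hypothetical sample data: each timestamp is {year, month, day, hour, minute, second}.
   These values are made up for illustration and are not taken from my mailbox. *)
sent = {
   {2009, 6, 1, 8, 55, 0},
   {2010, 2, 14, 12, 10, 0},
   {2011, 3, 9, 9, 5, 0},
   {2011, 11, 23, 21, 40, 0}
   };

(* x: the calendar date as an absolute time; y: the time of day in fractional hours *)
diurnalPoint[{y_, m_, d_, h_, min_, s_}] :=
  {AbsoluteTime[{y, m, d}], h + min/60. + s/3600.};

(* One unjoined point per message, with the vertical axis fixed to a 24-hour day *)
DateListPlot[diurnalPoint /@ sent,
 Joined -> False,
 PlotRange -> {Automatic, {0, 24}},
 FrameLabel -> {"Date", "Hour of day"}]
```

With thousands of messages instead of four, habits show up as bands (a regular start time) and gaps (a vacation).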
The picture for my incoming email is very different:
For a very long time, I used my work email for everything, but in late 2004, when I was experiencing a bit of burnout, I decided to draw a clear line between my work life and personal life. I got a personal email account and started using my work email only for work. From late 2004 to early 2009, I used Panix for my personal email and then switched to Gmail.
One thing that’s clear: I get a lot of personal email, and the volume has only increased over time. It seems to come in constantly. A lot of this email is automated: notifications of comments on the blog, Twitter messages, and subscription-based mail. Those are pretty easy to see as the horizontal lines that run across sections of the plot early in the morning. But a fair amount is legitimate mail that I must deal with in one way or another.
I really like these diurnal plots, but another way to look at this is volume over time. Here is a plot of my incoming and outgoing email volume over time:
This data is averaged monthly. The darker plot is my sent email. Everything else is incoming mail. Clearly, the volume of my incoming mail is increasing. The amount of mail that I send is relatively stable, and that is on purpose. I used to send lots and lots of email, but I’ve tried to get that under control. It’s better for everyone.
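One way to compute a monthly average like this is to count the messages in each month and divide by the number of days in that month, which gives messages per day. The sketch below does that under stated assumptions: the timestamps are invented sample data, the names daysInMonth, monthlyCounts and monthlyAverage are hypothetical, and the code from the original post may define its average differently.

```mathematica
(* Hypothetical sample data: {year, month, day, hour, minute, second} per message *)
sent = {
   {2011, 1, 3, 8, 55, 0},
   {2011, 1, 17, 12, 10, 0},
   {2011, 2, 9, 9, 5, 0},
   {2011, 3, 23, 21, 40, 0},
   {2011, 3, 24, 7, 15, 0},
   {2011, 3, 30, 13, 0, 0}
   };

(* Days in a month, from the absolute times of two consecutive first-of-month dates.
   A month value of 13 rolls over into January of the following year. *)
daysInMonth[{y_, m_}] :=
  (AbsoluteTime[{y, m + 1, 1}] - AbsoluteTime[{y, m, 1}])/86400;

(* Tally messages by {year, month}: a list of {{year, month}, count} pairs *)
monthlyCounts = Tally[sent[[All, {1, 2}]]];

(* Messages per day for each month, dated at the first of that month *)
monthlyAverage =
  {Append[#[[1]], 1], N[#[[2]]/daysInMonth[#[[1]]]]} & /@ monthlyCounts;

DateListPlot[SortBy[monthlyAverage, First],
 Joined -> True,
 FrameLabel -> {"Month", "Messages per day"}]
```

Dividing by the days in each month keeps short months like February from looking artificially quiet.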
Here is the same data for email I’ve sent, with the daily volume plotted over the monthly averages:
And here is a similar plot for my incoming email:
Another interesting way to look at this data is to see how much email I send and receive throughout an average day, morning to night. Here is my typical behavior for sending email throughout the day:
I clearly do the bulk of my email sending in the morning, and of course, there is a spike at lunchtime. And I’m pretty good about not sending email in the middle of the night.
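A time-of-day profile like this comes down to counting messages by the hour in which they were sent. Here is a minimal sketch with invented sample timestamps; the name hourlyCounts is hypothetical, and the sketch plots raw counts rather than whatever per-day normalization the original code applies.

```mathematica
(* Hypothetical sample data: {year, month, day, hour, minute, second} per message *)
sent = {
   {2011, 3, 14, 8, 55, 0},
   {2011, 3, 14, 12, 10, 0},
   {2011, 3, 15, 9, 5, 0},
   {2011, 3, 15, 12, 30, 0},
   {2011, 3, 16, 8, 20, 0}
   };

(* The hour is the fourth element of each timestamp *)
hours = sent[[All, 4]];

(* Count messages in each of the 24 hours, including hours with no mail *)
hourlyCounts = Table[Count[hours, h], {h, 0, 23}];

BarChart[hourlyCounts,
 ChartLabels -> Range[0, 23],
 AxesLabel -> {"Hour of day", "Messages"}]
```

Dividing hourlyCounts by the number of days covered would turn the raw counts into an average per day.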
Here is my incoming email throughout the day:
I receive a fair volume of email pretty steadily throughout the day. That spike at about 5am each morning is a daily email from Google Calendar with my agenda for the day. And that spike at 7pm is likely an email I get telling me that my cloud backups for the day have completed successfully.
I have some other plots that came from the Mathematica code, but I imagine the charts I’ve already shown will test the patience of most readers, so I won’t bore you with the rest. I am still learning how to perform this analysis with Mathematica and am adding to my scripts to pull in data from my FitBit pedometer, from my key logger, and from some other interesting sources. You can expect to see some of that data in the future once I have the scripts working properly.