Calendar Search

I need to text every student from this semester with when I'm planning on going home, and I decided to search my tutoring calendar for the names using python to get a complete list.

First Steps

Things necessary to get started, which you may do slightly differently if you're not using Google Calendar or if you want to use the python icalendar package instead.

I used Google Calendar's Settings → Import & export feature to export all my calendars, then extracted from the .zip file the particular .ics file for my tutoring calendar. I renamed the file to something easier to type, tutoring.ics.

I'm using python 3.0, which mainly means you'll see print("hello") in my code instead of print "hello" when I want to run that command. I installed the python ics package to parse the file with.

Exploration

The trial-and-error before I had solid code. Skip if you just want the answer below!

I started by checking the ics page to see if it had any good documentation. It focused on creating calendars from scratch and downloading directly from the internet, neither of which I cared about, so I gave up on that.

Then I opened a python interpreter and ran:

import ics  

then typed ics. and hit the tab key twice. For me, that listed off the functions in the ics package. (I could just as easily have run help(ics) or dir(ics) for a similar, though not identical, listing.) The function ics.events_in_year looked like what I wanted, so then I ran:

help(ics.events_in_year)  

and saw that it wanted two arguments, a filename and a year, which I assumed were supposed to be a string and an int. So then I ran:

ics.events_in_year("tutoring.ics", 2017)  

which spat out an awful mess of data, so I captured that with:

data = _  

which stored that last output as data. The awful mess looked like a list because it ended with ], so I checked what the first element looked like with:

data[0]  

That looked like a dictionary since it ended with }, and had two notable entries I could spot at a glance: 'DTSTART': '20170125T210000Z\n', which looked like the start date-time, and 'SUMMARY', which had the name of a student of mine. So then I started making a for loop to print the names with:

for datum in data:  
    print(datum["SUMMARY"])

which printed a mess of names, with a bunch of extra blank lines because of the final '\n' newline character in each SUMMARY. I also noticed (as expected) lots of duplicate names, which isn't helpful. To fix those things:

names = set() # like a list without order and no duplicates  
for datum in data:  
    name = datum["SUMMARY"][:-1] # the last part gets rid of the last character, the \n newline
    names.add(name) # add this name to the set of names

for name in names: print(name)  

That looked pretty good, but some names were old, from the spring or summer, when I only wanted the fall semester, so I started trying to select the names after August. First I ran data[0]["DTSTART"] to see what the date looked like again, then tested the slice data[0]["DTSTART"][4:4+2] to get the month (4 to skip the 4 digits of year, 2 to include the next two digits). Then I tried to run:

names = set()  
for datum in data:  
    month = int(datum["DTSTART"][4:4+2]) # casting to int so it can be compared with 8
    if month <= 8: continue # skip anything August or earlier
    name = datum["SUMMARY"][:-1]
    names.add(name)

for name in names: print(name)  

but an error came up! KeyError: 'DTSTART'. Apparently some of the events don't have a start date? That didn't seem right. So I ran:

for datum in data:  
    if "DTSTART" not in datum: break

datum  

to find the first event without DTSTART and look at it. Turns out, some of the events have a DTSTART;TZID=America/Chicago entry instead, to show the time zone the date is in. That's cool; now I just have to first find a key that begins with DTSTART before I get the start date:

names = set()  
for datum in data:  
    for key in datum:
        # stop at the first key starting with DTSTART
        if key.startswith("DTSTART"): break
    month = int(datum[key][4:4+2]) # use that key, whatever it is
    if month <= 8: continue
    name = datum["SUMMARY"][:-1]
    names.add(name)

for name in names: print(name)  
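As an aside, the for/break search for a key starting with DTSTART can also be written with next() and a generator expression. This is only a sketch: start_key is a name I'm making up here, and the sample dictionaries are invented to have the same shape as the events described above, not taken from my calendar.

```python
def start_key(datum):
    """Return the first key that begins with DTSTART, or None if there is none."""
    return next((key for key in datum if key.startswith("DTSTART")), None)

# Invented sample events shaped like the ones described above
plain = {"DTSTART": "20170125T210000Z\n", "SUMMARY": "Jane Doe\n"}
zoned = {"DTSTART;TZID=America/Chicago": "20170904T160000\n", "SUMMARY": "John Roe\n"}
empty = {"SUMMARY": "No start date\n"}

for datum in (plain, zoned, empty):
    print(start_key(datum))
```

Unlike the for/break version, this gives back None when no key matches, instead of leaving key set to whatever the last key in the dictionary happened to be.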

Code

What works.

From within the directory with the tutoring.ics file, either open a python interpreter and run the following code, or copy-paste it in a .py file and run it. If those words make no sense, look over this tutorial.

import ics  
data = ics.events_in_year("tutoring.ics", 2017)

names = set()  
for datum in data:  
    for key in datum:
        if key.startswith("DTSTART"): break
    month = int(datum[key][4:4+2])
    if month <= 8: continue
    name = datum["SUMMARY"][:-1]
    names.add(name)

for name in sorted(names): print(name)  
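If the [4:4+2] slice feels too magic, the standard datetime module can pull the month out instead. This is a sketch under the assumption that every start value begins with eight digits of year, month, and day, like the '20170125T210000Z\n' example above; start_date is a hypothetical helper name, not something from the ics package.

```python
from datetime import datetime

def start_date(value):
    """Parse the leading YYYYMMDD of a DTSTART value into a date.

    Assumes the value starts with eight date digits; anything after them
    (the time, a trailing Z, the newline) is ignored.
    """
    return datetime.strptime(value.strip()[:8], "%Y%m%d").date()

# The example value from earlier in this post
when = start_date("20170125T210000Z\n")
print(when.year, when.month, when.day)
```

One caveat: a value ending in Z is in UTC, so an evening event in Chicago can land on the next day's date, which only matters for events right at a month boundary.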

Future Work

The next time I need this script, I'd like to modify it so that it truncates everything after the second word of a name, so that the entries where I added some notation in the SUMMARY after the name don't count as separate names.
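A rough sketch of that truncation, assuming every name really is exactly two words (which won't hold for everybody). The helper name first_two_words and the sample strings are made up for illustration.

```python
def first_two_words(summary):
    """Keep only the first two whitespace-separated words of a SUMMARY value."""
    # split() with no arguments also discards the trailing newline
    return " ".join(summary.split()[:2])

# Made-up sample values, not real calendar entries
print(first_two_words("Jane Doe (makeup session)\n"))
print(first_two_words("Jane Doe\n"))
```

Both lines print Jane Doe, so the two entries would collapse into one name in the set.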

Also, instead of simply including every name that shows up in September or later, I'm thinking I'd like it to make a dictionary of name: month, where month is the latest month in which name shows up. Then I can see where the prior semester seems to have ended, and select from those the names for this semester.
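Here is one way that dictionary might be built. It's a sketch, not tested against my real calendar: start_month and latest_months are hypothetical helper names, and the sample list is invented to match the shape of the events described in the Exploration section.

```python
def start_month(datum):
    """Return the month (1-12) from the first DTSTART-like key, or None."""
    for key in datum:
        if key.startswith("DTSTART"):
            return int(datum[key][4:4+2])
    return None

def latest_months(data):
    """Map each name to the latest month in which it appears."""
    latest = {}
    for datum in data:
        month = start_month(datum)
        if month is None:
            continue  # skip events with no start date at all
        name = datum["SUMMARY"].strip()
        latest[name] = max(month, latest.get(name, 0))
    return latest

# Invented sample events, not real calendar data
sample = [
    {"DTSTART": "20170125T210000Z\n", "SUMMARY": "Jane Doe\n"},
    {"DTSTART;TZID=America/Chicago": "20170904T160000\n", "SUMMARY": "Jane Doe\n"},
    {"DTSTART": "20170412T200000Z\n", "SUMMARY": "John Roe\n"},
]

for name, month in sorted(latest_months(sample).items()):
    print(name, month)
```

With the sample above, Jane Doe maps to 9 and John Roe to 4, which would put John in the spring and Jane in the fall.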