Monthly Archives: February 2013

Shapefile to GeoJSON

A fellow attendee at the Open Data Day Toronto event hosted by Urban+Digital Toronto was wondering how to display Toronto neighbourhood data on a map that she was working on. I couldn’t find any GeoJSON of the neighbourhoods out in the open, so I grabbed a shapefile from the City of Toronto Open Data Portal – Toronto Neighborhood Planning Areas.

Toronto Neighbourhoods Shapefile

Toronto Neighbourhoods Shapefile In QGIS

To export it to GeoJSON I used the ogr2ogr tool from GDAL (Geospatial Data Abstraction Library).

$ ogr2ogr -t_srs EPSG:4269 -f GeoJSON Neighbourhoods.json Neighbourhoods.shp

The resulting GeoJSON file was 1.4 MB. That’s a bit big to send down to the browser, so I ran the Simplify geometries tool in Quantum GIS (QGIS) and re-exported to GeoJSON. The simplified file is a manageable 308 KB. It still looks good, so it could be shrunk even further if needed.
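To give a feel for what geometry simplification does, here is a minimal sketch of the Douglas-Peucker algorithm, a common approach to thinning out the points of a line or polygon ring. It is an illustration only and not a claim about how the QGIS tool is implemented; the function names and the tolerance value are made up for the example. The tolerance is in the same units as the coordinates (degrees for the export above).

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed ring): distance to the point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: keep only points that deviate more than
    `tolerance` from the straight line between their neighbours."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord first-last.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Everything in between is close enough to the chord: drop it.
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)]
print(simplify(line, 0.1))
```

A larger tolerance removes more points and gives a smaller file, at the cost of coarser neighbourhood boundaries.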

The original shapefile, the resulting simplified GeoJSON, and a Leaflet-based example are up on GitHub: adamw523.github.com/toronto-geojson.

Backing Up My Gmail

I keep a lot of information in my Gmail account, spread across 72,488 email messages. All of my email accounts forward to my Gmail Inbox. The convenience of having all of my email in one place is great, but I am putting all of my eggs in one basket. It’s unlikely, but Google could lose those emails, or for some reason revoke my access to the free service that I’ve been using for so long.

My current backup is still running on the cheapest instance available over at DigitalOcean:

$ fab docean status
...
Status:
backup running : yes
number of emails : 52697
size on disk : 1.2G

A few requirements I had for my Gmail backup:

  • Free (as in beer and speech)
  • Runs remotely, so it doesn’t heat up my laptop or shut down when I close the lid
  • Filesystem-based for later statistical analysis
  • Easy to back up
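As an illustration of why a filesystem-based backup is handy for analysis, here is a minimal sketch that walks a backup directory and reports the number of files and their total size, similar to the status figures above. It assumes the backup stores one file per message somewhere under a single root directory, which is an assumption for the example rather than a description of the tool’s actual layout; `backup_stats` and the path are hypothetical.

```python
import os


def backup_stats(root):
    """Count the files under `root` and sum their sizes in bytes."""
    count, total = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            count += 1
            total += os.path.getsize(os.path.join(dirpath, name))
    return count, total


if __name__ == "__main__":
    # Hypothetical location of the backup; adjust to your own setup.
    emails, size = backup_stats("backup")
    print("number of emails :", emails)
    print("size on disk     : %.1fM" % (size / 1024.0 / 1024.0))
```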

In searching for a tool to host my own backup of Gmail, I came across gmailbackup. I’ve forked it and made a small change to its parsing of date fields in emails. One culprit message that caused the parse error is the introductory message sent by Google, entitled “Gmail is different. Here’s what you need to know.”, which I received when I first signed up on June 23, 2004. When viewed with the “Show Original” option, it lacks the required orig-date field from RFC 2822 (Internet Message Format).
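One way to tolerate a message with no Date header is to fall back to another source for the timestamp. The sketch below uses Python’s standard email package and tries the Date header first, then the date that follows the semicolon in any Received header. This is an assumption about a reasonable fallback, not the actual change made in the fork, and `message_date` is a hypothetical helper name.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def message_date(raw):
    """Return a datetime for a raw message, or None if no date is found."""
    msg = message_from_string(raw)
    # Preferred source: the Date header (the RFC 2822 orig-date field).
    candidates = [msg.get("Date")]
    # Fallback: each Received header ends with "; <date-time>".
    for received in msg.get_all("Received", []):
        candidates.append(received.rsplit(";", 1)[-1].strip())
    for value in candidates:
        if not value:
            continue
        try:
            return parsedate_to_datetime(value)
        except (TypeError, ValueError):
            # Unparseable value; try the next candidate.
            continue
    return None
```

Returning None instead of raising lets the caller decide what to do with an undated message, for example filing it under an "unknown" folder rather than aborting the whole backup.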

Backing up 72,000 emails takes a long time, so I’d rather not run it on my laptop. I’ve put together a Fabric script to handle the major functionality of running the backup on a remote server. I originally got this working inside a Vagrant-managed virtual machine, but have moved to running it remotely on a DigitalOcean VPS. Setting up a new instance and tearing it down for testing is incredibly quick and simple with them, and they’re not currently charging for backups, so it’s as close to free as can be. I’ll probably bring back the Vagrant configs shortly.

You can check out the source and instructions on how to get it running over at GitHub: https://github.com/adamw523/gmailarchive

Year of the Snake – The Python Variety

Happy Lunar New Year! This year I’m looking forward to exploring more of the world of Python. Reading the excellent book Programming Collective Intelligence has inspired me to dig deeper into the language.


IPython has become one of my indispensable tools when working in Python. I’ve been enjoying using its shell for regular development, and also in the browser using the Notebook feature, a web-based user interface for authoring Python code. It’s great at displaying images and other artifacts right in the web browser.


I’ve started a bit of my own Python development. I switched my remote server provisioning and deployment procedures to Fabric.

I’ve also put together a quick project to play around with creating, building and sharing a PyPI-hosted package. My first such project is a library and command-line tool for interfacing with the RESTful API of the cloud hosting provider DigitalOcean.

https://github.com/adamw523/dodo
