Even More Fun with dapper.net

One of the challenges with getting the Voter Info Project up and running is that so much of the data lives in raw HTML pages, with no data feeds that can be easily imported. Remember back in the day when we wrote Perl screen scrapers to deal with this? Yeah, I had no desire to go back to that. So, enter dapper.net.

With Dapper, I was able to point it at the HTML pages I wanted to load in. Then, using a visual tool, I identified which elements on the page I cared about. THEN I simply labeled each field of data and grouped the fields together. Voila! About 10 seconds later, I had turned straight HTML into an RSS feed, and I could just as easily have turned it into generic XML, Atom, a Google Gadget, iCal, CSV, etc.
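For anyone curious what that pick-label-group step amounts to under the hood, here is a rough hand-rolled sketch in Python. To be clear, this is not Dapper's code or API. The sample HTML, the field labels, and the function names (`extract_records`, `records_to_rss`) are all made up for illustration, and it assumes the data sits in plain table rows.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample page; the real pages and their layout are assumptions.
SAMPLE_HTML = """
<table>
  <tr><td>Jane Doe</td><td>2006-03-01</td><td>$1,000</td></tr>
  <tr><td>John Roe</td><td>2006-04-15</td><td>$2,500</td></tr>
</table>
"""

# Labels for each column, chosen for illustration only.
FIELDS = ["recipient", "date", "amount"]


class RowCollector(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def extract_records(html):
    """Label each cell with a field name and group cells into records."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return [dict(zip(FIELDS, row)) for row in parser.rows if len(row) == len(FIELDS)]


def records_to_rss(records, title):
    """Serialize the records as a bare-bones RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    for rec in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["recipient"]
        ET.SubElement(item, "description").text = f'{rec["amount"]} on {rec["date"]}'
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(records_to_rss(extract_records(SAMPLE_HTML), "Example contributions"))
```

Swapping `records_to_rss` for a CSV or Atom writer would give the other output formats; the extraction step stays the same.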

Just to give you an idea of what I’ve been able to do, check out these feeds of MPAA PAC contribs, MSFT PAC contribs, and RIAA PAC contribs, respectively, in the state of California.

Now that I have this data in a useful format, I can focus on importing it into the Voter Information Project.
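The import side could start out as simple as the Python sketch below, which reads an RSS document and flattens each item into a dictionary ready for loading. The sample feed and the `feed_to_rows` name are my own placeholders; the actual feeds and the project's import code may look quite different.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed contents; the real feeds' fields are an assumption.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example PAC contributions</title>
<item><title>Jane Doe</title><description>$1,000 on 2006-03-01</description></item>
<item><title>John Roe</title><description>$2,500 on 2006-04-15</description></item>
</channel></rss>"""


def feed_to_rows(feed_xml):
    """Turn each <item> in an RSS document into a plain dict."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.iter("item"):
        rows.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
        })
    return rows


if __name__ == "__main__":
    for row in feed_to_rows(SAMPLE_FEED):
        print(row)
```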

