GSoC weekly report #1

posted on May 18, 2011

This is the first post in a series of weekly reports on my GSoC’11 project, “Client-server model for reporting and querying package statistics”. The project aims to implement a client program that gathers package information from Gentoo hosts and submits it to a server, which calculates useful statistics from the data and presents them through a webapp.

Over the past 2-3 weeks, I have been communicating with my mentor, Alec, and have already written a proof-of-concept client and server that are ready to be deployed and start collecting data, thanks to Alec's code reviews and excellent ideas.

A short summary of my progress:

  • Created project repository on git.overlays.gentoo.org

  • Read up on RESTful Web Services (from the O’Reilly book)

  • ~Improved~ Tried to improve my Python coding style, thanks to the excellent guide from Google, suggested by Alec

  • Wrote a simple client in Python that collects a few environment variables from portage and the list of installed packages with their USE flags, encodes the data as JSON, and issues an authenticated POST to the server

  • Wrote a simple webapp using web.py to handle requests from the above client and save the data to MySQL tables

  • Wrote some documentation to deploy the webapp
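
To give a rough idea of the client side described above, here is a minimal sketch, not the project's actual code. It uses only the Python 3 standard library; the field names (`auth`, `env`, `packages`), the token, the sample data and the URL are all made up for illustration, and the real client reads its data from portage rather than from hard-coded dictionaries.

```python
import json
import urllib.request


def build_payload(env, packages, auth_token):
    """Bundle host data into a JSON-serialisable dict.

    The keys used here are hypothetical, chosen only for this sketch.
    """
    return {
        "auth": auth_token,    # placeholder for whatever authentication is used
        "env": env,            # a few portage environment variables
        "packages": packages,  # installed package -> list of USE flags
    }


def build_request(url, payload):
    """Encode the payload as JSON and wrap it in a POST request object."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sample data standing in for what would be read from portage.
payload = build_payload(
    env={"ARCH": "amd64", "CHOST": "x86_64-pc-linux-gnu"},
    packages={"app-editors/vim-7.3": ["acl", "nls"]},
    auth_token="example-token",
)

# Hypothetical server address; nothing is sent until urlopen() is called.
request = build_request("http://localhost:8080/host", payload)
```

Passing `request` to `urllib.request.urlopen()` would actually submit the data; that step is left out so the sketch can be run without a server.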

Issues encountered during development:

  • Choice of portage API vs gentoolkit API: The gentoolkit API is very easy to use but quite slow compared to the portage API. In the end, Alec asked me to use both of them as necessary, but to provide an easy way to swap out one in favor of the other at a later time.

  • Choice of web framework, TurboGears vs others: I ended up using web.py for the server rather than TurboGears (as promised in my proposal), since I found it easier to implement RESTful services with web.py. Also, web.py is lightweight, provides more control, and has enough features for implementing this webapp. However, should I hit a snag using web.py, it probably wouldn’t take more than a day or two of coding to replace it with TurboGears.
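
One possible way to keep the portage and gentoolkit APIs swappable is to hide them behind a small common interface. The sketch below only illustrates that idea under my own assumptions: the class and function names are hypothetical, and the two backends return canned data instead of calling the real portage or gentoolkit APIs.

```python
from abc import ABC, abstractmethod


class PackageSource(ABC):
    """Common interface that the rest of the client codes against."""

    @abstractmethod
    def installed_packages(self):
        """Return a sorted list of installed package names."""


class PortageSource(PackageSource):
    """Stand-in for a backend built on the portage API (canned data here)."""

    def installed_packages(self):
        return sorted(["sys-apps/portage-2.1.9", "app-editors/vim-7.3"])


class GentoolkitSource(PackageSource):
    """Stand-in for a backend built on the gentoolkit API (canned data here)."""

    def installed_packages(self):
        return sorted(["app-editors/vim-7.3", "sys-apps/portage-2.1.9"])


def collect(source):
    """Gather data through the interface, unaware of which backend is used."""
    return {"packages": source.installed_packages()}


# Swapping backends is a one-line change at the call site.
report = collect(PortageSource())
```

With this layout, replacing one API with the other later on means writing or changing a single backend class, while `collect()` and everything downstream stay untouched.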

The project repository currently provides a working client and server, and it’ll be deployed soon on soc.dev.gentoo.org, once Alec sets me up with shell access.

Plans for the upcoming weeks:

  • Get updates from the community on what data should be collected from hosts

  • Try to add more fields to the client/server and modify the SQL tables accordingly

  • Learn more about the portage API and discuss it on #gentoo-portage

My semester exams are currently in progress and will last until the end of May, so I won’t be able to work during these 2 weeks. However, I look forward to getting back in June and continuing with my project.