Data Crunching, Part 2


languages, database connections are managed by driver classes that, in theory, all offer the same interface. In order to use MySQL or PostgreSQL instead of SQLite, we should only have to change the import statement that loaded the driver and the connect statement that created the link. In practice, each database supports a slightly different dialect of SQL, so it's common to have to tweak our queries as well.

Once we have a connection, we create a cursor. This marks our place in the database, just as an editor cursor marks your place in a document. We can then execute the query, which we pass to the cursor as a string. Following is the fetchall call that grabs the
query's results in one go. If we were expecting a lot of data, we might page through the results instead.

Next, we process the results of the query (discussed below). Once that's done, we close the cursor and then close the connection. Forgetting to tidy up is a common mistake, and while it won't cause an immediate crash, your program may eventually fall over because it has run out of connections.

The processing inside the loop separates the variable's name from its parameters and then tries to find a matching node in the XML document. If the node exists, its parameters' values are overwritten. If it doesn't, a new node is created and appended to the document:

The full text of the functions findVarNode,
addOrReplaceParam, and
addVarNode can be found in sample code online.

It's 11:30 a.m., and the script is humming away happily in the background. Of the two-and-a-half hours it took to solve this problem, almost an hour involved tracking down information about file formats, making sure files had the right permissions, and so on; coding and testing only took about ninety minutes. Knowing how to use standard data-crunching tools--the Unix command line, regular expressions, DOM, a little SQL, and most importantly a scripting language (in this case Python)--is the reason it took so little time. We can use data-crunching tools to solve many development project problems, and investing a few hours in mastering them pays enormous dividends.

Click Here to Read
Data Crunching Part 1 of 2

About the author

AgileConnection is a TechWell community.

Through conferences, training, consulting, and online resources, TechWell helps you develop and deliver great software every day.