Marcos Placona Blog

Programming, technology and the taming of the web.

Category: Python

Python is a programming language that lets you work more quickly and integrate your systems more effectively.

Dart Pub packages stats

So modulecounts came to my attention, and I thought the idea was pretty neat.

I then contacted Erik (the author), and asked if he would mind also adding Dart.

He replied saying he wouldn’t mind, but quite rightly pointed out there seemed to be no way on Dart’s package manager to actually find out how many packages were live.

I asked some community members on G+, and while they seemed to have some solutions that kinda worked, there didn’t seem to be anything accurate or likely to work every time.

My solution was to quickly knock up a scraper that would navigate through every page on the website, grab some information and counts, aggregate them, and generate JSON output that anyone looking for package information could use.
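
The aggregation step can be sketched roughly as follows. This is not the actual pub-stats code: `fetch_page` is a hypothetical callable standing in for the HTTP request and HTML parsing, and the field names in the output are assumptions made for illustration.

```python
import json


def collect_packages(fetch_page):
    """Walk numbered pages until one comes back empty; return all package names."""
    packages = []
    page = 1
    while True:
        names = fetch_page(page)  # hypothetical: returns a list of package names
        if not names:
            break
        packages.extend(names)
        page += 1
    return packages


def build_packet(packages):
    """Aggregate the scraped names into a JSON string (field names are assumed)."""
    return json.dumps({"total": len(packages), "packages": sorted(packages)})


# Stand-in for the real site: two pages of results, then an empty one.
fake_site = {1: ["args", "http"], 2: ["path"], 3: []}
packet = build_packet(collect_packages(lambda page: fake_site.get(page, [])))
```

Passing the page fetcher in as an argument keeps the counting logic testable without touching the network.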

The JSON packet is generated once a day so as not to overload the website.

I then created a very simple App Engine client application that consumes the JSON packet and shows information about it.

Check it out here: http://pub-stats.appspot.com

Source code can also be seen in my GitHub account: https://github.com/mplacona/pub-stats

I have also exposed another endpoint for anyone wanting to use the JSON packet in their applications.

The packet is also cached daily, to make sure my App Engine account doesn’t get abused :-)

You can see the JSON endpoint here: http://pub-stats.appspot.com/json
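
For anyone curious what daily caching can look like, here is a minimal in-process sketch. It is not the code behind the endpoint, which may well rely on App Engine's own caching; `DailyCache` and `load_packet` are hypothetical names, and the clock is injectable only so the expiry can be exercised without waiting a day.

```python
import time

ONE_DAY = 24 * 60 * 60  # cache lifetime in seconds


class DailyCache:
    """Keep one value and rebuild it only when it is at least a day old."""

    def __init__(self, loader, clock=time.time):
        self._loader = loader  # hypothetical function that builds the packet
        self._clock = clock    # injectable so the expiry can be tested
        self._value = None
        self._stamp = None

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= ONE_DAY:
            self._value = self._loader()
            self._stamp = now
        return self._value


calls = []


def load_packet():
    calls.append(1)  # count how often the expensive work runs
    return '{"total": 0}'


fake_now = [0]
cache = DailyCache(load_packet, clock=lambda: fake_now[0])
cache.get()
cache.get()            # served from the cache
fake_now[0] = ONE_DAY  # a day later
cache.get()            # rebuilt
```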

Collaboration and pull requests welcome!

Recursively delete folders with Python

Recursion

At work we’ve been doing some deploy optimization, and the need to automatically (and recursively) delete specific folders came up.
We use MXUnit to unit test our applications, and we store each test alongside the code it relates to, inside _test folders. So we end up with lots of folders in our file structure that are not supposed to go into production for security reasons.

We use SVN for development, but don’t use it in production, also for security reasons, so we always end up with a deploy package (an SVN export) containing all of the files necessary for a specific release.

When this package is generated, it still has our test cases in it, and it would be really painful to delete all the “_test” folders one by one if the release is big.

We easily end up with something like:

Messy folders

We then thought of an automated way of removing all these folders.

I first looked at a simple batch file that would “spider” the folder and delete all occurrences of a specific folder name. Not too long after, I realized this is a rather complex task for someone who doesn’t really know his way around batch files.

I’ve been a Python enthusiast for a few months, so I thought I should give it a try and write something using it. The script turned out to be ridiculously simple.

I first do my necessary imports and read information from an XML configuration file:

import os
import sys
from xml.dom import minidom

# Directories queued for removal once their files are gone
emptyDirs = []

# Read the XML settings
doc = minidom.parse('settings.xml')
files_node = doc.getElementsByTagName("files")[0]

# Use the configured path if it's not empty; otherwise fall back to the
# folder this script lives in. sys.frozen tells us whether we're running
# from an executable.
path = files_node.attributes["path"].value
if path == "":
    if getattr(sys, "frozen", False):
        path = os.path.dirname(sys.executable)
    else:
        # abspath avoids an empty result when run as "python script.py"
        path = os.path.dirname(os.path.abspath(sys.argv[0]))

# The folder name to be removed
remove = files_node.attributes["remove"].value
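
The post does not show settings.xml itself, so here is a guess at what it might contain, parsed the same way. Only the `files` element and its `path` and `remove` attributes are implied by the script; the root element name and the values are assumptions made for this example.

```python
from xml.dom import minidom

# Hypothetical settings.xml contents. The <settings> root and the
# attribute values are invented; only <files path remove> comes from
# the script above.
SETTINGS = """<?xml version="1.0"?>
<settings>
    <files path="" remove="_test" />
</settings>"""

doc = minidom.parseString(SETTINGS)
files_node = doc.getElementsByTagName("files")[0]

path = files_node.attributes["path"].value      # "" means "use the script's folder"
remove = files_node.attributes["remove"].value  # folder name to delete
```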

I then define the two functions responsible for deleting files and directories:

# Delete the files inside a directory prior to its deletion
def deleteFiles(dirList, dirPath):
    for fileName in dirList:
        print("Deleting " + fileName)
        os.remove(os.path.join(dirPath, fileName))

# Empty a directory and queue it for removal. dirEntry is an os.walk tuple:
# (dirpath, dirnames, filenames). Inserting at the front keeps child
# directories ahead of their parents in emptyDirs.
def removeDirectory(dirEntry):
    print("Deleting files in " + dirEntry[0])
    deleteFiles(dirEntry[2], dirEntry[0])
    emptyDirs.insert(0, dirEntry[0])

The next bit of code is the most important, as it walks through the directories and their children:

# Walk through the tree recursively
tree = os.walk(path)
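
To make the shape of those results concrete, the sketch below (written for Python 3) builds a small throwaway tree and walks it. The folder and file names are made up; the point is that each entry is a `(dirpath, dirnames, filenames)` tuple, which is why the script indexes entries with `[0]` and `[2]`.

```python
import os
import tempfile

# Build a small throwaway tree:
#   root/app/_test/test_a.cfc   (names are invented for the example)
root = tempfile.mkdtemp()
test_dir = os.path.join(root, "app", "_test")
os.makedirs(test_dir)
open(os.path.join(test_dir, "test_a.cfc"), "w").close()

# Each entry is a (dirpath, dirnames, filenames) tuple, parents first.
entries = list(os.walk(root))
for dirpath, dirnames, filenames in entries:
    print(dirpath, dirnames, filenames)
```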

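I then loop through the results of the walk, calling the functions defined above:

# Empty every matching directory, then remove the directories themselves.
# Note this is a substring match on the full path, not an exact name match.
for directory in tree:
    if remove in str(directory[0]):
        removeDirectory(directory)
# emptyDirs only holds matching paths, children before parents
for dirPath in emptyDirs:
    print("Removing " + dirPath)
    os.rmdir(dirPath)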

I need to delete all the files inside the folders before deleting the folders themselves; otherwise I’ll get errors saying the folder is not empty.
And that’s all I had to do. I then realized I’d need a way to distribute it: some people at work don’t have Python installed, and I didn’t want them to go through the hassle of installing it and running files from the command line.
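
As an aside on the files-before-folders ordering described above: the standard library can handle that part on its own. The sketch below is a different technique from the script in this post, not a drop-in copy of it. It uses `shutil.rmtree` and matches the folder name exactly rather than as a substring of the path; `remove_named_dirs` and the sample tree are hypothetical.

```python
import os
import shutil
import tempfile


def remove_named_dirs(root, name):
    """Delete every directory called exactly `name` below `root`; return their paths."""
    removed = []
    for dirpath, dirnames, _ in os.walk(root):
        if name in dirnames:
            target = os.path.join(dirpath, name)
            shutil.rmtree(target)  # removes files and subfolders in one call
            dirnames.remove(name)  # tell os.walk not to descend into it
            removed.append(target)
    return removed


# Throwaway tree to try it on (names are invented for the example).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "app", "_test", "fixtures"))
os.makedirs(os.path.join(root, "app", "my_test_data"))
open(os.path.join(root, "app", "_test", "test_a.cfc"), "w").close()

removed = remove_named_dirs(root, "_test")
```

With an exact match, a folder such as `my_test_data` is left alone, whereas a substring check on the path would catch it too.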

In my next post, I’ll show how to generate self-contained .exe files with py2exe.
In the meantime, you can download the source here and the binaries here.