Archive for the ‘Python’ Category

I recently had to implement the k-means algorithm to cluster genes based on their profiles for a bioinformatics homework assignment. Even though I had written my own code, I wanted to compare its results against Biopython. For those who do not know, Biopython is a set of libraries for writing bioinformatics code, and it implements most of the fundamental algorithms in the field. It was a great lifesaver for me, as I can test my ideas in a few lines of Python code.

On Ubuntu, you can install it with the command: sudo apt-get install python-biopython

Anyway, I searched the net for a code snippet that does k-means with Biopython and somehow did not find any, so I wrote one myself using the documentation. I thought I would post the code on the blog so that if anyone needs it in the future, it's a Google search away :)

Biopython’s k-means code requires the input to be in the format accepted by Eisen’s TreeView program, so that needed some data massaging. The code per se is very simple. It passes the data file, the number of clusters and the number of runs to try as input. Alternatively, it can pass an array (initialid) that sets the initial cluster assignment, from which the starting centroids are computed.

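from Bio import Cluster

# Read the tab-separated data file
f = open("gData1.csv")
record = Cluster.Record(f)
f.close()

numClusters = 4
numRecords = 12

# Initial assignment: consecutive, equal-sized blocks of genes, one block per cluster.
# The loop body was missing from the listing; append(i) is the assumed original.
initialId = []
for i in range(numClusters):
    for j in range(numRecords // numClusters):
        initialId.append(i)

# Alternative: start from initialId instead of trying 10 random starts
#(clusterAssignment, totalError, numPasses) = record.kcluster(nclusters=numClusters,initialid=initialId)
(clusterAssignment, totalError, numPasses) = record.kcluster(nclusters=numClusters,npass=10)

# Group the gene names by the cluster each gene was assigned to
geneNames = record.genename
g = [ [] for i in range(numClusters)]
numIndex = 0
for i in clusterAssignment :
    g[i].append(geneNames[numIndex])  # assumed: this line was missing from the listing
    numIndex += 1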
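Since the whole point was to compare a hand-written k-means against Biopython, here is a minimal sketch of what such a hand-written version might look like. It is not my homework code and not Biopython's implementation: the function name kmeans and its parameters are made up for illustration, it assumes squared Euclidean distance, and it starts from an initial assignment in the same spirit as initialid.

```python
def kmeans(points, initial_ids, n_clusters, max_iter=100):
    """Plain k-means starting from an initial cluster assignment (hypothetical helper)."""
    ids = list(initial_ids)
    centroids = []
    for _ in range(max_iter):
        # Centroid of each cluster = coordinate-wise mean of its members.
        centroids = []
        for c in range(n_clusters):
            members = [p for p, k in zip(points, ids) if k == c]
            if not members:
                raise ValueError("cluster %d is empty" % c)
            centroids.append([sum(col) / float(len(members)) for col in zip(*members)])
        # Reassign every point to its nearest centroid (squared Euclidean distance).
        new_ids = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, cen)) for cen in centroids]
            new_ids.append(dists.index(min(dists)))
        if new_ids == ids:
            break  # assignments stopped changing, so we have converged
        ids = new_ids
    return ids, centroids
```

Keep in mind that cluster numbers are arbitrary labels, so when comparing against another implementation it is the grouping of genes that should match, not necessarily the numbers.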


The input file looks like this. It is a tab-separated file whose first two columns are special: the gene ID and the gene name.

GENEID  NAME    PARAM1      PARAM2      
BLAH1    BLAH2   -0.43          0.3	
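The data massaging can be sketched roughly as below. This is only an illustration under assumptions: the function to_cluster_text and the shape of its input (a list of id, name, values tuples) are hypothetical, and the header simply mirrors the sample above.

```python
def to_cluster_text(genes, param_names):
    """Build tab-separated text with the gene ID and gene name as the first two columns."""
    header = ["GENEID", "NAME"] + list(param_names)
    lines = ["\t".join(header)]
    for gene_id, name, values in genes:
        if len(values) != len(param_names):
            raise ValueError("gene %s has %d values, expected %d"
                             % (gene_id, len(values), len(param_names)))
        lines.append("\t".join([gene_id, name] + [str(v) for v in values]))
    return "\n".join(lines) + "\n"
```

The returned text can then be written to a file and handed to the clustering code.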


Hope the snippet is useful to someone!



I am currently learning Hindi (more on that in another post) and was thinking of ways to improve my vocabulary. I decided to write a GNOME applet that displays Hindi words and their meanings in the panel and refreshes them periodically.
I chose Python, partly because I was lazy and partly because the app itself is pretty trivial.

I was surprised to find that there is not much documentation on developing GNOME applets in Python. Of all the sites I looked at, only one had useful information, and even that was buried in the search results. The top result seems to be this page, but it is woefully old. It does give you a basic idea, but a lot has changed since 2004! (For example, gnome.applet became gnomeapplet, God knows why.) If you prefer code, you can check here and here, although they don't have much explanation.

GNOME applets consist of two parts: a server file and the actual Python file for the applet. The server file gives basic information about your applet, such as its unique ID, title, description and the location of the executable. It is simple and straightforward enough that you can take any sample server file, such as this, and start modifying it.

Here is the server file I used. (I am copying all the code here as I am having problems uploading arbitrary files to WordPress.)

Some things to note are:

a) I have given the full path of the Python file. There are alternative ways too, but this is the easiest.

b) The IID in the server file and the one in the Python file have to match.


<oaf_info>

<!-- The type and location attributes were missing from this listing. The values below follow the usual Bonobo .server layout; the path is a placeholder for the full path of the Python file. -->
<oaf_server iid="OAFIID:GNOME_HindiScroller_Factory"
            type="exe" location="/full/path/to/hindiScroller.py">

        <oaf_attribute name="repo_ids" type="stringv">
                <item value="IDL:Bonobo/GenericFactory:1.0"/>
                <item value="IDL:Bonobo/Unknown:1.0"/>
        </oaf_attribute>
        <oaf_attribute name="name" type="string" value="HindiScroller"/>
        <oaf_attribute name="description" type="string" value="Python Hindi Scroller Applet"/>
</oaf_server>

<oaf_server iid="OAFIID:GNOME_HindiScroller"
            type="factory" location="OAFIID:GNOME_HindiScroller_Factory">

        <oaf_attribute name="repo_ids" type="stringv">
                <item value="IDL:GNOME/Vertigo/PanelAppletShell:1.0"/>
                <item value="IDL:Bonobo/Control:1.0"/>
                <item value="IDL:Bonobo/Unknown:1.0"/>
        </oaf_attribute>
        <oaf_attribute name="name" type="string" value="HindiScroller"/>
        <oaf_attribute name="description" type="string" value="Python Hindi Scroller Applet"/>
        <oaf_attribute name="panel:category" type="string" value="Utility"/>
        <oaf_attribute name="panel:icon" type="string" value="iconimage.png"/>
</oaf_server>

</oaf_info>
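Because the IID has to match between the two files, a quick check can save some head-scratching. Here is a minimal sketch, assuming the server file is well-formed XML. The function names server_iids and iid_is_declared are my own and not part of any GNOME tooling; only the standard library is used.

```python
import xml.etree.ElementTree as ET

def server_iids(server_xml):
    """Return the iid attribute of every <oaf_server> element in a .server document."""
    root = ET.fromstring(server_xml)
    return [server.get("iid") for server in root.iter("oaf_server")]

def iid_is_declared(server_xml, iid):
    """True if the IID used in the Python file appears in the server file."""
    return iid in server_iids(server_xml)
```

You would read the server file into a string and pass it in together with the IID string that the Python file hands to the factory call.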

The Python code for the flasher is here. It is pretty simple and I have added basic comments to it. The main reason I wanted to upload the code was to give an example of developing GNOME applets using Python and OOP. Most examples on the net build applets without using classes. There are some basic things to note:

a) The parent class is gnomeapplet.Applet (not gnome.applet.Applet).

b) To add a timeout, use gobject.timeout_add.

c) The best way to debug the applet is to make it run in a separate window of its own. For my program, adding a “-d” option makes it run in debug mode.

d) The GTK label has a set_markup function, which allows HTML-like (Pango) markup to be used as the label text. Neat!

e) The Python file has to be readable and executable by everyone.
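Point e) is usually done from the shell, but it can also be sketched in Python. This is a minimal sketch using only the standard library; the function name make_world_executable is made up for illustration.

```python
import os
import stat

def make_world_executable(path):
    """Add read and execute permission for user, group and others; return the new mode bits."""
    wanted = (stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH |
              stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    current = os.stat(path).st_mode
    os.chmod(path, current | wanted)
    return stat.S_IMODE(os.stat(path).st_mode)
```

It only adds permission bits and never removes any, so existing write permissions are left alone.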

The code per se is very simple for the Python cognoscenti.

#!/usr/bin/env python
import pygtk
pygtk.require('2.0') #Assumed: the usual PyGTK version check, missing from the listing

import gtk
import gnomeapplet
import gobject

import sys
import codecs
import random

class HindiScroller(gnomeapplet.Applet):
	#Reads a utf-8 file, reads all lines and returns them as a list with newlines stripped
	def readFile(self,fileName):
		f = codecs.open(fileName, "r", "utf-8") #Pass the encoding explicitly, otherwise nothing is decoded
		allLines = f.readlines()
		f.close()
		strippedLines = [line.strip()  for line in allLines]
		return strippedLines

	#Picks a random word from the list. If it's empty, it starts with the original list again.
	def getNextWord(self):
		if len(self.wordsToShow) == 0:
			self.wordsToShow = self.allWords[:]
		selectedWord = random.choice(self.wordsToShow)
		#Assumed: drop the word so it is not repeated until the list runs out
		self.wordsToShow.remove(selectedWord)
		english,hindi = selectedWord.split("|")
		wordInMarkups = "<b>" + english + "</b>   " + hindi
		return wordInMarkups

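	#Displays the next word in the GUI. Uses set_markup so that the markup is rendered
	def displayNextWord(self):
		wordToShow = self.getNextWord()
		self.label.set_markup(wordToShow) #Assumed: this call was missing from the listing
		return True #Returning True keeps the timeout firing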

	def __init__(self,applet,iid):
		self.timeout_interval = 1000 * 10 #Timeout set to 10secs
		self.applet = applet

		#File used as source expects each line in english|hindi format
		self.fileName = "/home/neo/applet/hindidict.txt"

		self.wordsToShow = []
		self.allWords = self.readFile(self.fileName)

		wordToShow = self.getNextWord()

		self.label = gtk.Label("")
		#Assumed: the next three lines were missing from the listing
		self.label.set_markup(wordToShow)
		self.applet.add(self.label)
		self.applet.show_all()

		gobject.timeout_add(self.timeout_interval, self.displayNextWord)

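#Register the applet datatype
gobject.type_register(HindiScroller) #Assumed: the call this comment refers to was missing

def hindi_scroller_factory(applet,iid):
	HindiScroller(applet,iid) #Assumed: builds the applet contents
	return True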

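#Very useful if I want to debug. To run in debug mode: python hindiScroller.py -d
if len(sys.argv) == 2:
	if sys.argv[1] == "-d": #Debug mode
		main_window = gtk.Window(gtk.WINDOW_TOPLEVEL)
		main_window.set_title("Python Applet")
		main_window.connect("destroy", gtk.main_quit)
		app = gnomeapplet.Applet()
		#Assumed: the rest of this block was missing; it follows the usual run-in-a-window pattern
		hindi_scroller_factory(app,None)
		app.reparent(main_window)
		main_window.show_all()
		gtk.main()
		sys.exit()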

#If called via gnome panel, run it in the proper way
if __name__ == '__main__':
	gnomeapplet.bonobo_factory("OAFIID:GNOME_HindiScroller_Factory", HindiScroller.__gtype__, "hello", "0", hindi_scroller_factory)
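The word-picking logic can be tried out without GNOME at all. Below is a minimal GUI-free sketch of the same idea, assuming each line is in english|hindi format. The class WordCycler is hypothetical and not part of the applet; it also escapes characters such as & and <, which the markup parser would otherwise choke on.

```python
import random
from xml.sax.saxutils import escape

class WordCycler(object):
    """Hands out dictionary entries in random order, without repeats until all are used."""

    def __init__(self, lines, rng=None):
        self.all_words = [line.strip() for line in lines if line.strip()]
        self.remaining = []
        self.rng = rng or random.Random()

    def next_markup(self):
        # Refill from the full list once every entry has been shown.
        if not self.remaining:
            self.remaining = self.all_words[:]
        entry = self.rng.choice(self.remaining)
        self.remaining.remove(entry)
        english, hindi = entry.split("|", 1)
        # Bold English word followed by the Hindi word, with markup characters escaped.
        return "<b>%s</b>   %s" % (escape(english), escape(hindi))
```

The string returned by next_markup is what would be handed to the label's set_markup function.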

I have to say, I am pretty much exhausted from trying to upload and format code in WordPress, so I am not going to write the good tutorial I originally intended to :( Instead, I have highlighted some lines which I think are important.

Some comments: do read the files in the folder /usr/share/doc/python-gnomeapplet/examples. It has some sample files, and the README file gives some nice information.

For a step-by-step discussion, look here. The PyGTK reference manual is here. You can also check out the GTK reference, which is really good, but I had a hard time converting it to Python’s class structure.

Have fun!
