affils1.clusters.clusters: 1696

affils2.clusters.clusters: 1897
affils2.clusters.singletons: 129

affils3.clusters.clusters: 1678 clusters (with 2283 strings)
affils3.clusters.singletons: 243 strings

Total: 12723 (includes 6935 outliers)

1st pass outliers (affils.tmp.outliers): These have been further reduced to 1662 
	clusters with 2238 strings. The remaining strings are singletons, which I
	did not write out to a file because I forgot to make a small change 
	in the code. 

Now: The total number of clusters is 12085.
Time taken: 10 hrs. (The first pass takes 4 to 5 hours. After that, the time 
 taken drops drastically, to much less than an hour for clustering
 each set.)

 Approach:
 The first pass is more of a data cleaning operation than clustering.
 It generates affils.tmp.clusters and affils.tmp.outliers.

 Then affils.tmp.clusters is split into three parts (the number was chosen 
 arbitrarily, for convenience), and each part is clustered separately. 
 The three parts are affils1.clusters, affils2.clusters, and affils3.clusters.
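
 The split step could be sketched as follows. This is only an illustration,
 not the code used for this run: it assumes affils.tmp.clusters holds one
 record per line and that the parts are contiguous and of near-equal size.
 The helper names split_into_parts and write_parts are hypothetical.

```python
def split_into_parts(lines, parts=3):
    """Split a list of lines into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(len(lines), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional line each.
        size = base + (1 if i < extra else 0)
        chunks.append(lines[start:start + size])
        start += size
    return chunks


def write_parts(src="affils.tmp.clusters", template="affils{}.clusters", parts=3):
    """Write each chunk of `src` to affils1.clusters, affils2.clusters, ...

    Assumption: one record per line. If a cluster spans several lines,
    the split would have to respect record boundaries instead.
    """
    with open(src, encoding="utf-8") as fh:
        lines = fh.readlines()
    for i, chunk in enumerate(split_into_parts(lines, parts), start=1):
        with open(template.format(i), "w", encoding="utf-8") as out:
            out.writelines(chunk)
```

 Because the number of parts is arbitrary, it is kept as a parameter here
 rather than hard-coded.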

 Clustering each part generates two files: affils?.clusters.clusters 
 and affils?.clusters.singletons (where ? is 1, 2, or 3).

 In addition, we clustered affils.tmp.outliers to generate two 
 more files, affils.outliers.clusters and affils.outliers.outliers. (The
 latter file does not exist in this run. Take its count to be
 the number of remaining strings, that is, those not accounted for anywhere else.)
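
 The count for the missing file can be obtained by subtraction. The sketch
 below is a hypothetical helper (remaining_strings is not part of the
 original code). Which figures to pass in is left to the reader, since the
 notes above mix cluster counts and string counts; all inputs must be
 string counts for the result to be meaningful.

```python
def remaining_strings(total_strings, accounted_counts):
    """Return the number of strings not accounted for in any output file.

    total_strings    -- total number of input strings
    accounted_counts -- string counts for every file that was written
    """
    remaining = total_strings - sum(accounted_counts)
    if remaining < 0:
        # More strings accounted for than exist: the inputs are inconsistent.
        raise ValueError("accounted counts exceed the total")
    return remaining
```

 For example, with made-up numbers, remaining_strings(100, [40, 25, 10])
 gives 25.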