The main remaining difference I can see between the old dataset and the latest one is that the old one has more overlap between the components. Maybe there wasn't enough overlap between the components in the new datasets. If I extend the range of component 1 to [0,1], we get

http://pages.cs.wisc.edu/~nathanae/hall/synthhole_lunifnomargin.png
http://pages.cs.wisc.edu/~nathanae/hall/synthhole_lunifnomargin_oldh.png

which looks good again.

Here's another dataset:

http://pages.cs.wisc.edu/~nathanae/hall/synthhole_lunif_onetoten.png
http://pages.cs.wisc.edu/~nathanae/hall/synthhole_lunif_onetoten_oldh.png

This dataset is generated as follows (a MATLAB sketch is at the end of this entry):

    for j=1:  x_d ~ Unif[1,9],            for d = 1..3
    for j=2:  x_d ~ Unif[0,3] U [7,10],   for d = 1..3

Our lambda (lstar) is closer to the true proportions than theirs (l0):

    l0    = 0.4051    0.5949
    lstar = 0.4866    0.5134

    >> sum([y==1 y==2])
    ans =
       495   505

Here is w (theirs is w0, ours is wstar):

    >> w0(1:10,:), wstar(1:10,:)
    ans =
        0.1648    0.8352
        0.0525    0.9475
        0.1314    0.8686
        0.1522    0.8478
        0.0687    0.9313
        0.8915    0.1085
        0.0884    0.9116
        0.6086    0.3914
        0.0565    0.9435
        0.8156    0.1844
    ans =
        0.5875    0.4125
             0    1.0000
             0    1.0000
             0    1.0000
             0    1.0000
        1.0000         0
             0    1.0000
        1.0000         0
             0    1.0000
        1.0000         0

    >> [y(1:10)==1 y(1:10)==2]
    ans =
         0     1
         0     1
         1     0
         0     1
         0     1
         1     0
         0     1
         1     0
         0     1
         1     0

As before, our w is more confident, i.e. sparser, than theirs, but it is not always right: for example, we put all of the weight for x(3,:) on component 2, when it really belongs to component 1. x(3,:) is

    8.9953    7.4295    2.1911

so it actually does look like it came from component 2, though it could plausibly have come from either. They also get this one wrong.

I think I need to look at Benaglia et al. '07 and understand exactly how their algorithm works.
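For reference, here is roughly how the second dataset above was generated. This is a sketch from memory, not the actual script; n and the 50/50 label draw are assumptions, chosen to be consistent with sum([y==1 y==2]) = 495 505 above.

    % Rough re-creation of the Unif[1,9] vs Unif[0,3]U[7,10] dataset (sketch only)
    n = 1000; D = 3;
    y = randi(2, n, 1);                  % component labels, roughly half and half
    x = zeros(n, D);
    for i = 1:n
        for d = 1:D
            if y(i) == 1
                x(i,d) = 1 + 8*rand;     % component 1: Unif[1,9]
            elseif rand < 0.5            % component 2: Unif[0,3] U [7,10],
                x(i,d) = 3*rand;         %   equal mass on the two pieces
            else
                x(i,d) = 7 + 3*rand;
            end
        end
    end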
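Before I actually read Benaglia et al., here is my current guess at what one iteration of their EM-like update looks like: coordinates conditionally independent given the component, with a weighted kernel density estimate per component and coordinate. Everything here (the function name, the fixed bandwidth h, the Gaussian kernel) is my assumption until I check the paper.

    % Guess at one npEM-style iteration, to be checked against the paper.
    % x is n-by-D data, w the n-by-J matrix of current posterior weights,
    % h a fixed bandwidth.
    function [w, lambda] = npem_step_guess(x, w, h)
        [n, D] = size(x);
        J = size(w, 2);
        lambda = mean(w, 1);                      % mixing proportions from current weights
        logdens = zeros(n, J);
        for j = 1:J
            for d = 1:D
                % weighted Gaussian-kernel estimate of f_{jd}, evaluated at x(:,d)
                U = (x(:,d) - x(:,d)') / h;       % n-by-n scaled pairwise differences
                K = exp(-U.^2 / 2) / sqrt(2*pi);  % Gaussian kernel values
                fjd = (K * w(:,j)) / (n * lambda(j) * h);
                logdens(:,j) = logdens(:,j) + log(fjd);
            end
        end
        num = exp(logdens) .* lambda;             % lambda_j * prod_d f_{jd}(x(i,d))
        w = num ./ sum(num, 2);                   % new posterior component weights
    end

If this matches the paper, comparing it line by line against our update should show where the extra sparsity in wstar comes from.

--------------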