"""
====================================================================
K-means clustering and vector quantization (:mod:`scipy.cluster.vq`)
====================================================================

Provides routines for k-means clustering, generating code books
from k-means models, and quantizing vectors by comparing them with
centroids in a code book.

.. autosummary::
   :toctree: generated/

   whiten -- Normalize a group of observations so each feature has unit variance
   vq -- Calculate code book membership of a set of observation vectors
   kmeans -- Performs k-means on a set of observation vectors forming k clusters
   kmeans2 -- A different implementation of k-means with more methods
              for initializing centroids

Background information
======================
The k-means algorithm takes as input the number of clusters to
generate, k, and a set of observation vectors to cluster.  It
returns a set of centroids, one for each of the k clusters.  An
observation vector is classified with the cluster number or
centroid index of the centroid closest to it.

A vector v belongs to cluster i if it is closer to centroid i than
to any other centroid. If v belongs to i, we say centroid i is the
dominating centroid of v. The k-means algorithm tries to
minimize distortion, which is defined as the sum of the squared distances
between each observation vector and its dominating centroid.  Each
step of the k-means algorithm refines the choices of centroids to
reduce distortion. The change in distortion is used as a
stopping criterion: when the change is lower than a threshold, the
k-means algorithm is not making sufficient progress and
terminates. One can also define a maximum number of iterations.
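
For example, with three one-dimensional observations assigned to a
single centroid at 2.0, the distortion is the sum of squared distances
(a toy illustration of the definition above; the numbers are made up):

>>> import numpy as np
>>> obs = np.array([0., 1., 4.])
>>> np.sum((obs - 2.0) ** 2)   # (0-2)^2 + (1-2)^2 + (4-2)^2
9.0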

Since vector quantization is a natural application for k-means,
information theory terminology is often used.  The centroid index
or cluster index is also referred to as a "code" and the table
mapping codes to centroids and vice versa is often referred to as a
"code book". The result of k-means, a set of centroids, can be
used to quantize vectors. Quantization aims to find an encoding of
vectors that reduces the expected distortion.

All routines expect obs to be an M by N array where the rows are
the observation vectors. The codebook is a k by N array where the
i'th row is the centroid of code word i. The observation vectors
and centroids have the same feature dimension.
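
For instance, two observations checked against a two-entry code book (a
minimal sketch with made-up values; ``.tolist()`` is used only to keep
the printed dtype platform-independent):

>>> import numpy as np
>>> from scipy.cluster.vq import vq
>>> obs = np.array([[1.0, 1.0, 1.0],
...                 [3.5, 3.5, 3.5]])      # M=2 observations, N=3 features
>>> code_book = np.array([[1., 1., 1.],
...                       [4., 4., 4.]])   # k=2 codes, N=3 features
>>> vq(obs, code_book)[0].tolist()
[0, 1]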

As an example, suppose we wish to compress a 24-bit color image
(each pixel is represented by one byte for red, one for blue, and
one for green) before sending it over the web.  By using a smaller
8-bit encoding, we can reduce the amount of data by two
thirds. Ideally, the colors for each of the 256 possible 8-bit
encoding values should be chosen to minimize distortion of the
color. Running k-means with k=256 generates a code book of 256
codes, which fills up all possible 8-bit sequences.  Instead of
sending a 3-byte value for each pixel, the 8-bit centroid index
(or code word) of the dominating centroid is transmitted. The code
book is also sent over the wire so each 8-bit code can be
translated back to a 24-bit pixel value representation. If the
image of interest was of an ocean, we would expect many 24-bit
blues to be represented by 8-bit codes. If it was an image of a
human face, more flesh tone colors would be represented in the
code book.
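
A sketch of this idea on synthetic data (the random pixels and the
choice of k=8 instead of 256 are illustrative, kept small for speed):

>>> import numpy as np
>>> from scipy.cluster.vq import whiten, kmeans, vq
>>> rng = np.random.RandomState(0)
>>> pixels = rng.rand(1000, 3)       # 1000 synthetic RGB triples
>>> wp = whiten(pixels)              # give each channel unit variance
>>> code_book, _ = kmeans(wp, 8)     # build an 8-entry code book
>>> codes, _ = vq(wp, code_book)     # one small code per pixel
>>> codes.shape
(1000,)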

i    (   t   divisiont   print_functiont   absolute_importt   restructuredtextt   whitent   vqt   kmeanst   kmeans2N(   t   _asarray_validated(   t   _numpy_compati   (   t   _vqt   ClusterErrorc           B` s   e  Z RS(    (   t   __name__t
   __module__(    (    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR   [   s   c         C` sd   t  |  d | ƒ}  t j |  d d ƒ} | d k } | j ƒ  r\ d | | <t j d t ƒ n  |  | S(   sà  
    Normalize a group of observations on a per feature basis.

    Before running k-means, it is beneficial to rescale each feature
    dimension of the observation set with whitening. Each feature is
    divided by its standard deviation across all observations to give
    it unit variance.

    Parameters
    ----------
    obs : ndarray
        Each row of the array is an observation.  The
        columns are the features seen during each observation.

        >>> #         f0    f1    f2
        >>> obs = [[  1.,   1.,   1.],  #o0
        ...        [  2.,   2.,   2.],  #o1
        ...        [  3.,   3.,   3.],  #o2
        ...        [  4.,   4.,   4.]]  #o3

    check_finite : bool, optional
        Whether to check that the input matrices contain only finite numbers.
        Disabling may give a performance gain, but may result in problems
        (crashes, non-termination) if the inputs do contain infinities or NaNs.
        Default: True

    Returns
    -------
    result : ndarray
        Contains the values in `obs` scaled by the standard deviation
        of each column.

    Examples
    --------
    >>> from scipy.cluster.vq import whiten
    >>> features  = np.array([[1.9, 2.3, 1.7],
    ...                       [1.5, 2.5, 2.2],
    ...                       [0.8, 0.6, 1.7,]])
    >>> whiten(features)
    array([[ 4.17944278,  2.69811351,  7.21248917],
           [ 3.29956009,  2.93273208,  9.33380951],
           [ 1.75976538,  0.7038557 ,  7.21248917]])

    t   check_finitet   axisi    g      ð?sW   Some columns have standard deviation zero. The values of these columns will not change.(   R   t   npt   stdt   anyt   warningst   warnt   RuntimeWarning(   t   obsR   t   std_devt   zero_std_mask(    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR   _   s    -
	
c         C` s²   t  |  d | ƒ}  t  | d | ƒ} t j |  | ƒ } |  j | d t ƒ} | j | k rl | j | ƒ } n | } | t j t j f k rŸ t j	 | | ƒ } n t
 |  | ƒ } | S(   sj  
    Assign codes from a code book to observations.

    Assigns a code from a code book to each observation. Each
    observation vector in the 'M' by 'N' `obs` array is compared with the
    centroids in the code book and assigned the code of the closest
    centroid.

    The features in `obs` should have unit variance, which can be
    achieved by passing them through the whiten function.  The code
    book can be created with the k-means algorithm or a different
    encoding algorithm.

    Parameters
    ----------
    obs : ndarray
        Each row of the 'M' x 'N' array is an observation.  The columns are
        the "features" seen during each observation. The features must be
        whitened first using the whiten function or something equivalent.
    code_book : ndarray
        The code book is usually generated using the k-means algorithm.
        Each row of the array holds a different code, and the columns are
        the features of the code.

         >>> #              f0    f1    f2   f3
         >>> code_book = [
         ...             [  1.,   2.,   3.,   4.],  #c0
         ...             [  1.,   2.,   3.,   4.],  #c1
         ...             [  1.,   2.,   3.,   4.]]  #c2

    check_finite : bool, optional
        Whether to check that the input matrices contain only finite numbers.
        Disabling may give a performance gain, but may result in problems
        (crashes, non-termination) if the inputs do contain infinities or NaNs.
        Default: True

    Returns
    -------
    code : ndarray
        A length M array holding the code book index for each observation.
    dist : ndarray
        The distortion (distance) between the observation and its nearest
        code.

    Examples
    --------
    >>> from numpy import array
    >>> from scipy.cluster.vq import vq
    >>> code_book = array([[1.,1.,1.],
    ...                    [2.,2.,2.]])
    >>> features  = array([[  1.9,2.3,1.7],
    ...                    [  1.5,2.5,2.2],
    ...                    [  0.8,0.6,1.7]])
    >>> vq(features,code_book)
    (array([1, 1, 0],'i'), array([ 0.43588989,  0.73484692,  0.83066239]))

    R   t   copy(   R   R   t   common_typet   astypet   Falset   dtypet   float32t   float64R
   R   t   py_vq(   R   t	   code_bookR   t   ctt   c_obst   c_code_bookt   results(    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR   —   s    :c   	      C` s|  t  |  d | ƒ}  t  | d | ƒ} t j |  ƒ d k rv t j |  ƒ t j | ƒ k sf t d ƒ ‚ q‹ t |  | ƒ Sn t j |  ƒ \ } } t j |  ƒ t j | ƒ k s¸ t d ƒ ‚ n3 | | j d k së t d | j d | f ƒ ‚ n  t j | d t ƒ} t j | ƒ } xW t | ƒ D]I } t j	 |  | | d d ƒ } t j
 | ƒ | | <| | | | | <qW| t j | ƒ f S(   sÎ   Python version of vq algorithm.

    The algorithm computes the Euclidean distance between each
    observation and every frame in the code_book.

    Parameters
    ----------
    obs : ndarray
        Expects a rank 2 array. Each row is one observation.
    code_book : ndarray
        Code book to use. Same format as obs. Should have the same number
        of features (e.g., columns) as obs.
    check_finite : bool, optional
        Whether to check that the input matrices contain only finite numbers.
        Disabling may give a performance gain, but may result in problems
        (crashes, non-termination) if the inputs do contain infinities or NaNs.
        Default: True

    Returns
    -------
    code : ndarray
        code[i] gives the label of the ith observation; its code is
        code_book[code[i]].
    min_dist : ndarray
        min_dist[i] gives the distance between the ith observation and its
        corresponding code.

    Notes
    -----
    This function is slower than the C version but works for
    all input types.  If the inputs have the wrong types for the
    C versions of the function, this one is called as a last resort.

    It is about 20 times slower than the C version.
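
    Examples
    --------
    A minimal sketch; `py_vq` follows the same contract as `vq`:

    >>> import numpy as np
    >>> from scipy.cluster.vq import py_vq
    >>> code_book = np.array([[1., 1., 1.],
    ...                       [2., 2., 2.]])
    >>> features = np.array([[1.9, 2.3, 1.7],
    ...                      [1.5, 2.5, 2.2],
    ...                      [0.8, 0.6, 1.7]])
    >>> py_vq(features, code_book)[0]
    array([1, 1, 0])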

    R   i   s3   Observation and code_book should have the same ranksN   Code book(%d) and obs(%d) should have the same number of features (eg columns)R   i   (   R   R   t   ndimt
   ValueErrort	   _py_vq_1dt   shapet   zerost   intt   ranget   sumt   argmint   sqrt(	   R   R!   R   t   nt   dt   codet   min_distt   it   dist(    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR    ã   s(    %c         C` s§   t  d ƒ ‚ |  j } | j } t j | | f ƒ } x; t | ƒ D]- } t j |  | | ƒ | d d … | f <q@ Wt | ƒ t j | ƒ } | | } | t j | ƒ f S(   s   Python version of vq algorithm for rank 1 only.

    Parameters
    ----------
    obs : ndarray
        Expects a rank 1 array. Each item is one observation.
    code_book : ndarray
        Code book to use. Same format as obs. Should be rank 1 too.

    Returns
    -------
    code : ndarray
        code[i] gives the label of the ith observation; its code is
        code_book[code[i]].
    min_dist : ndarray
        min_dist[i] gives the distance between the ith observation and its
        corresponding code.

    s1   _py_vq_1d buggy, do not use rank 1 arrays for nowN(	   t   RuntimeErrort   sizeR   R*   R,   R-   t   printR.   R/   (   R   R!   R0   t   ncR5   R4   R2   R3   (    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR(   )  s    		+

c         C` sü   t  |  d | ƒ}  t  | d | ƒ} t j |  ƒ d } | | j d k sj t d | j d | f ƒ ‚ n  |  t j d d … d d … f | d d … t j d d … f } t j t j | | d ƒ ƒ } t j | d ƒ } t j j	 | d ƒ } | | f S(   sÈ  2nd Python version of vq algorithm.

    The algorithm simply computes the Euclidean distance between each
    observation and every frame in the code_book.

    Parameters
    ----------
    obs : ndarray
        Expect a rank 2 array. Each row is one observation.
    code_book : ndarray
        Code book to use. Same format as obs. Should have the same number
        of features (e.g., columns) as obs.
    check_finite : bool, optional
        Whether to check that the input matrices contain only finite numbers.
        Disabling may give a performance gain, but may result in problems
        (crashes, non-termination) if the inputs do contain infinities or NaNs.
        Default: True

    Returns
    -------
    code : ndarray
        code[i] gives the label of the ith observation; its code is
        code_book[code[i]].
    min_dist : ndarray
        min_dist[i] gives the distance between the ith observation and its
        corresponding code.

    Notes
    -----
    This could be faster when the number of codes is small, but it
    becomes a real memory hog when the codebook is large. It requires
    N by M by O storage where N = number of obs, M = number of
    features, and O = number of codes.

    R   i   sg   
            code book(%d) and obs(%d) should have the same
            number of features (eg columns)Niÿÿÿÿi    (
   R   R   R)   R'   t   newaxisR/   R-   R.   t   minimumt   reduce(   R   R!   R   R1   t   diffR5   R2   R3   (    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyt   py_vq2J  s    $Bgñhãˆµøä>c   
      C` sé   t  j | d t ƒ} g  } t  j } x´ | | k rÚ | j d } t |  | ƒ \ } } | j t  j | d d ƒƒ | | k r° t j	 |  | | ƒ \ } }	 | j
 |	 d d ƒ} n  t | ƒ d k r' | d | d } q' q' W| | d f S(   sM   "raw" version of k-means.

    Returns
    -------
    code_book
        the lowest distortion codebook found.
    avg_dist
        the average distance an observation is from a code in the book.
        Lower means the code_book matches the data better.

    See Also
    --------
    kmeans : wrapper around k-means

    Examples
    --------
    Note: not whitened in this example.

    >>> from numpy import array
    >>> from scipy.cluster.vq import _kmeans
    >>> features  = array([[ 1.9,2.3],
    ...                    [ 1.5,2.5],
    ...                    [ 0.8,0.6],
    ...                    [ 0.4,1.8],
    ...                    [ 1.0,1.0]])
    >>> book = array((features[0],features[2]))
    >>> _kmeans(features,book)
    (array([[ 1.7       ,  2.4       ],
           [ 0.73333333,  1.13333333]]), 0.40563916697728591)

    R   i    R   iÿÿÿÿi   iþÿÿÿ(   R   t   arrayt   Truet   infR)   R   t   appendt   meanR
   t   update_cluster_meanst   compresst   len(
   R   t   guesst   threshR!   t   avg_distR=   R9   t   obs_codet   distortt   has_members(    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyt   _kmeans‚  s    !	i   c         C` sØ  t  |  d | ƒ}  t | ƒ d k  r3 t d ƒ ‚ n  d
 } d
 } y t | ƒ } Wn# t k
 rt t  | d | ƒ} n X| d
 k	 r» | j d k  r£ t d | ƒ ‚ n  t |  | d | ƒ} n| | k rÖ t d ƒ ‚ n  t j } |  j	 d }	 | } | d k  rt d ƒ ‚ n  x¸ t
 | ƒ D]ª }
 t j j d |	 | ƒ } t j t j | d	 t ƒd d k ƒ ryt j j |	 ƒ |  } n  t j |  | d ƒ } t |  | d | ƒ\ } } | | k  r| } | } qqW| | f } | S(   s  
    Performs k-means on a set of observation vectors forming k clusters.

    The k-means algorithm adjusts the centroids until sufficient
    progress cannot be made, i.e. the change in distortion since
    the last iteration is less than some threshold. This yields
    a code book mapping centroids to codes and vice versa.

    Distortion is defined as the sum of the squared differences
    between the observations and the corresponding centroid.

    Parameters
    ----------
    obs : ndarray
       Each row of the M by N array is an observation vector. The
       columns are the features seen during each observation.
       The features must be whitened first with the `whiten` function.

    k_or_guess : int or ndarray
       The number of centroids to generate. A code is assigned to
       each centroid, which is also the row index of the centroid
       in the code_book matrix generated.

       The initial k centroids are chosen by randomly selecting
       observations from the observation matrix. Alternatively,
       passing a k by N array specifies the initial k centroids.

    iter : int, optional
       The number of times to run k-means, returning the codebook
       with the lowest distortion. This argument is ignored if
       initial centroids are specified with an array for the
       ``k_or_guess`` parameter. This parameter does not represent the
       number of iterations of the k-means algorithm.

    thresh : float, optional
       Terminates the k-means algorithm if the change in
       distortion since the last k-means iteration is less than
       or equal to thresh.

    check_finite : bool, optional
        Whether to check that the input matrices contain only finite numbers.
        Disabling may give a performance gain, but may result in problems
        (crashes, non-termination) if the inputs do contain infinities or NaNs.
        Default: True

    Returns
    -------
    codebook : ndarray
       A k by N array of k centroids. The i'th centroid
       codebook[i] is represented with the code i. The centroids
       and codes generated represent the lowest distortion seen,
       not necessarily the globally minimal distortion.

    distortion : float
       The distortion between the observations passed and the
       centroids generated.

    See Also
    --------
    kmeans2 : a different implementation of k-means clustering
       with more methods for generating initial centroids but without
       using a distortion change threshold as a stopping criterion.

    whiten : must be called prior to passing an observation matrix
       to kmeans.

    Examples
    --------
    >>> from numpy import array
    >>> from scipy.cluster.vq import vq, kmeans, whiten
    >>> features  = array([[ 1.9,2.3],
    ...                    [ 1.5,2.5],
    ...                    [ 0.8,0.6],
    ...                    [ 0.4,1.8],
    ...                    [ 0.1,0.1],
    ...                    [ 0.2,1.8],
    ...                    [ 2.0,0.5],
    ...                    [ 0.3,1.5],
    ...                    [ 1.0,1.0]])
    >>> whitened = whiten(features)
    >>> book = np.array((whitened[0],whitened[2]))
    >>> kmeans(whitened,book)
    (array([[ 2.3110306 ,  2.86287398],    # random
           [ 0.93218041,  1.24398691]]), 0.85684700941625547)

    >>> from numpy import random
    >>> random.seed((1000,2000))
    >>> codes = 3
    >>> kmeans(whitened,codes)
    (array([[ 2.3110306 ,  2.86287398],    # random
           [ 1.32544402,  0.65607529],
           [ 0.40782893,  2.02786907]]), 0.5196582527686241)

    R   i   s   iter must be at least 1.s)   Asked for 0 cluster ? initial book was %sRH   s0   if k_or_guess is a scalar, it must be an integeri    s   Asked for 0 cluster ? t   return_countsN(   R   R+   R'   t   Nonet	   TypeErrorR7   RM   R   RA   R)   R,   t   randomt   randintR   R	   t   uniqueR@   t   permutationt   take(   R   t
   k_or_guesst   iterRH   R   t   kRG   t   resultt	   best_distt   NoR4   t   k_random_indicest   bookR5   t	   best_book(    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR   µ  sB    _	c         C` s^   |  j  d k r |  j d } n	 |  j } t j j | ƒ } |  | |  d d … f j ƒ  } | S(   sÑ  Pick k points at random in data (one row = one observation).

    This is done by taking the first k values of a random permutation of
    0..N-1, where N is the number of observations.

    Parameters
    ----------
    data : ndarray
        Expects a rank 1 or 2 array. Rank 1 arrays are assumed to describe
        one-dimensional data; rank 2, multidimensional data, in which case one
        row is one observation.
    k : int
        Number of samples to generate.

    i   i    N(   R&   R)   R7   R   RQ   RT   R   (   t   dataRX   R0   t   pt   x(    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyt   _kpointsA  s    	 c         ` s„   ‡  f d †  } ‡  f d †  } ‡  f d †  } t  j |  ƒ } | d k rR | |  ƒ S|  j d |  j d k rv | |  ƒ S| |  ƒ Sd S(   sú  Returns k samples of a random variable which parameters depend on data.

    More precisely, it returns k observations sampled from a Gaussian random
    variable whose mean and covariance are estimated from data.

    Parameters
    ----------
    data : ndarray
        Expects a rank 1 or 2 array. Rank 1 arrays are assumed to describe
        one-dimensional data; rank 2, multidimensional data, in which case one
        row is one observation.
    k : int
        Number of samples to generate.

    """
    def init_rank1(data):
        mu = np.mean(data)
        cov = np.cov(data)
        x = np.random.randn(k)
        x *= np.sqrt(cov)
        x += mu
        return x

    def init_rankn(data):
        mu = np.mean(data, 0)
        cov = np.atleast_2d(np.cov(data, rowvar=0))

        # k rows, d cols (one row = one obs)
        # Generate k samples of a random variable ~ Gaussian(mu, cov)
        x = np.random.randn(k, mu.size)
        x = np.dot(x, np.linalg.cholesky(cov).T) + mu
        return x

    def init_rank_def(data):
        # initialize when the covariance matrix is rank deficient
        mu = np.mean(data, axis=0)
        _, s, vh = np.linalg.svd(data - mu, full_matrices=False)
        x = np.random.randn(k, s.size)
        sVh = s[:, None] * vh / np.sqrt(data.shape[0] - 1)
        x = np.dot(x, sVh) + mu
        return x

    nd = np.ndim(data)
    if nd == 1:
        return init_rank1(data)
    elif data.shape[1] > data.shape[0]:
        return init_rank_def(data)
    else:
        return init_rankn(data)


_valid_init_meth = {'random': _krandinit, 'points': _kpoints}


def _missing_warn():
    """Print a warning when called."""
    warnings.warn("One of the clusters is empty. "
                  "Re-run kmeans with a different initialization.")


def _missing_raise():
    """Raise a ClusterError when called."""
    raise ClusterError("One of the clusters is empty. "
                       "Re-run kmeans with a different initialization.")


_valid_miss_meth = {'warn': _missing_warn, 'raise': _missing_raise}


def kmeans2(data, k, iter=10, thresh=1e-05, minit='random',
            missing='warn', check_finite=True):
    """
    Classify a set of observations into k clusters using the k-means algorithm.

    The algorithm attempts to minimize the Euclidean distance between
    observations and centroids. Several initialization methods are
    included.

    Parameters
    ----------
    data : ndarray
        A 'M' by 'N' array of 'M' observations in 'N' dimensions or a length
        'M' array of 'M' one-dimensional observations.
    k : int or ndarray
        The number of clusters to form as well as the number of
        centroids to generate. If `minit` initialization string is
        'matrix', or if a ndarray is given instead, it is
        interpreted as initial cluster to use instead.
    iter : int, optional
        Number of iterations of the k-means algrithm to run. Note
        that this differs in meaning from the iters parameter to
        the kmeans function.
    thresh : float, optional
        (not used yet)
    minit : str, optional
        Method for initialization. Available methods are 'random',
        'points', and 'matrix':

        'random': generate k centroids from a Gaussian with mean and
        variance estimated from the data.

        'points': choose k observations (rows) at random from data for
        the initial centroids.

        'matrix': interpret the k parameter as a k by M (or length k
        array for one-dimensional data) array of initial centroids.
    missing : str, optional
        Method to deal with empty clusters. Available methods are
        'warn' and 'raise':

        'warn': give a warning and continue.

        'raise': raise a ClusterError and terminate the algorithm.
    check_finite : bool, optional
        Whether to check that the input matrices contain only finite numbers.
        Disabling may give a performance gain, but may result in problems
        (crashes, non-termination) if the inputs do contain infinities or NaNs.
        Default: True

    Returns
    -------
    centroid : ndarray
        A 'k' by 'N' array of centroids found at the last iteration of
        k-means.
    label : ndarray
        label[i] is the code or index of the centroid the
        i'th observation is closest to.
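
    Examples
    --------
    A minimal sketch; the centroids depend on the random draw, so only
    the output shapes are shown:

    >>> import numpy as np
    >>> from scipy.cluster.vq import kmeans2
    >>> data = np.vstack((np.random.rand(100, 2),
    ...                   np.random.rand(100, 2) + 2.0))
    >>> centroid, label = kmeans2(data, 2, minit='points')
    >>> centroid.shape, label.shape
    ((2, 2), (200,))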

    R   s   Unkown missing method: %si   i   s   Input of rank > 2 not supporteds   Input has 0 items.t   matrixs/   k is not an int and has not same rank than datasF   k is not an int and has not same rank than                        datas,   k (%s) could not be converted to an integer s#   kmeans2 for 0 clusters ? (k was %s)s$   k was not an integer, was converted.s   unknown init method %ss9   iter = %s is not valid.  iter must be a positive integer.(   R   t   _valid_miss_methR'   t   strR   R&   R)   R7   RF   R   R+   RP   R   R   t   _valid_init_metht   KeyErrort   _kmeans2(   R_   RX   RW   RH   t   minitt   missingR   Ru   R1   R9   t   dct   clusterst   init(    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR      sJ    <	!c   	      C` s{   xn t  | ƒ D]` } t |  | ƒ d } t j |  | | ƒ \ } } | j ƒ  sg | ƒ  | | | | <n  | } q W| | f S(   se    "raw" version of kmeans2. Do not use directly.

    Run k-means with a given initial codebook.

    i    (   R,   R   R
   RD   t   all(	   R_   R2   t   niterR9   R‚   R4   t   labelt   new_codeRL   (    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyR€     s    
("   t   __doc__t
   __future__R    R   R   t   __docformat__t   __all__R   t   numpyR   t   scipy._lib._utilR   t
   scipy._libR	   t    R
   t	   ExceptionR   R@   R   R   R    R(   R>   RM   R   Rb   Rv   R~   Rx   Ry   R|   R   R€   (    (    (    s/   /tmp/pip-build-7oUkmx/scipy/scipy/cluster/vq.pyt   <module>D   s2   	8LF	!83Œ		3			o