k-means clustering algorithm

k-means is one of the simplest unsupervised learning algorithms for solving the well-known clustering problem. The procedure classifies a given data set into a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centers, one for each cluster. These centers should be placed carefully, because different starting locations lead to different results. A good choice is therefore to place them as far away from each other as possible. The next step is to take each point in the data set and associate it with the nearest center. When no point is pending, the first step is complete and an initial grouping is done. At this point we need to re-calculate k new centroids as the barycenters of the clusters resulting from the previous step. Once we have these k new centroids, a new binding has to be done between the same data set points and the nearest new center. This creates a loop. As a result of this loop, the k centers change their location step by step until no more changes occur, in other words, until the centers do not move any more. Finally, this algorithm aims at minimizing an objective function known as the squared error function, given by:
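$$J(V) = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left( \lVert x_j - v_i \rVert \right)^2$$
where,
‘||xj – vi||’ is the Euclidean distance between xj, the jth data point of the ith cluster, and the center vi of that cluster.

‘ci’ is the number of data points in the ith cluster.

‘c’ is the number of cluster centers.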
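
As a rough illustration of the squared error function, the following Python sketch sums the squared Euclidean distance from every point to the center of the cluster it belongs to. The function name `squared_error` and the way `clusters` and `centers` are laid out are assumptions made for this example only.

```python
def squared_error(clusters, centers):
    """Sum of squared Euclidean distances from each point to its cluster center.

    Assumed layout: clusters[i] is the list of points assigned to centers[i].
    """
    total = 0.0
    for points, center in zip(clusters, centers):
        for point in points:
            # Squared Euclidean distance, computed coordinate by coordinate.
            total += sum((a - b) ** 2 for a, b in zip(point, center))
    return total


# Two clusters: the first has two points around (1, 0), the second a single point.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 0.0)]]
centers = [(1.0, 0.0), (10.0, 0.0)]
print(squared_error(clusters, centers))  # 1 + 1 + 0 = 2.0
```

Moving a center away from the mean of its points makes this value grow, which is why the centers are recomputed as cluster means.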

Algorithmic steps for k-means clustering

Let X = {x1, x2, x3, ..., xn} be the set of data points and V = {v1, v2, ..., vc} be the set of centers.

1) Randomly select ‘c’ cluster centers.

2) Calculate the distance between each data point and cluster centers.

3) Assign each data point to the cluster center nearest to it, i.e. the one at minimum distance among all the cluster centers.

4) Recalculate the new cluster center using:

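$$v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_j$$

where ‘ci’ represents the number of data points in the ith cluster and the sum runs over the data points xj assigned to that cluster.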

5) Recalculate the distance between each data point and the newly obtained cluster centers.

6) If no data point was reassigned then stop, otherwise repeat from step 3).
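
The steps above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the names `kmeans`, `sq_dist` and `mean`, the `seed` and `max_iter` parameters, and the choice to keep the old center when a cluster becomes empty are all assumptions added for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, c, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(points, c)  # step 1: pick c data points at random
    labels = None
    for _ in range(max_iter):
        # Steps 2, 3 and 5: compute distances and pick the nearest center.
        new_labels = [
            min(range(c), key=lambda i: sq_dist(p, centers[i])) for p in points
        ]
        if new_labels == labels:  # step 6: no point was reassigned, so stop
            break
        labels = new_labels
        # Step 4: move each center to the mean of the points assigned to it.
        for i in range(c):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # assumption: an empty cluster keeps its old center
                centers[i] = mean(members)
    return centers, labels


points = [(0, 0), (1, 0), (3, 0), (100, 50), (101, 50), (103, 50)]
centers, labels = kmeans(points, 2)
print(sorted(centers))
print(labels)
```

For this small, well-separated data set the two centers end up at the means of the two visible groups whichever two points are drawn first. On less clean data the result can depend on the random start.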

Advantages

1) Fast, robust and easy to understand.

2) Relatively efficient: O(tknd), where n is the number of objects, k is the number of clusters, d is the number of dimensions of each object, and t is the number of iterations. Normally, k, t, d << n.

3) Gives the best results when the clusters in the data set are distinct or well separated from each other.
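
To make the O(tknd) estimate concrete, here is a small arithmetic sketch. The helper name `kmeans_cost` is hypothetical. The values n = 60 and k = 3 match Fig I, while t = 10 iterations and d = 2 dimensions are assumed purely for illustration.

```python
def kmeans_cost(t, k, n, d):
    """Rough operation count: t iterations x k centers x n points x d dimensions."""
    return t * k * n * d


print(kmeans_cost(t=10, k=3, n=60, d=2))   # 3600
print(kmeans_cost(t=10, k=3, n=120, d=2))  # 7200, doubling n doubles the cost
```

With t, k and d held fixed, the cost grows linearly with the number of data points.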

k-means.jpg
Fig I: Showing the result of k-means for ‘N’ = 60 and ‘c’ = 3

Note: For a more detailed figure of the k-means algorithm, please refer to the k-means figure sub page.

Disadvantages

1) The learning algorithm requires a priori specification of the number of cluster centers.

2) The use of exclusive assignment: if two clusters overlap heavily, k-means will not be able to resolve that there are two clusters.

3) The learning algorithm is not invariant to non-linear transformations, i.e. with a different representation of the data we get different results (data represented in Cartesian co-ordinates and in polar co-ordinates will give different results).

4) Euclidean distance measures can unequally weight underlying factors.

5) The learning algorithm provides only a local optimum of the squared error function.

6) Choosing the cluster centers randomly may not lead to a good result. Please refer to Fig.

7) Applicable only when the mean is defined, i.e. it fails for categorical data.

8) Unable to handle noisy data and outliers.

9) The algorithm fails for non-linear data sets.

k-means_fail.jpg
Fig II: Showing the non-linear data set where k-means algorithm fails
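
Two of the disadvantages listed above, the dependence on the data representation and the sensitivity to outliers, can be checked with a few lines of Python. The point, the two centers and the helper names (`sq_dist`, `to_polar`, `nearest`, `mean`) are made up for this sketch.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def to_polar(p):
    """Convert a Cartesian point (x, y) to polar co-ordinates (r, theta)."""
    x, y = p
    return (math.hypot(x, y), math.atan2(y, x))


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# Representation: the same point picks a different center after the transform.
point = (1.0, 0.0)
centers = [(0.0, 1.0), (2.5, 0.0)]
cartesian_choice = nearest(point, centers)
polar_choice = nearest(to_polar(point), [to_polar(c) for c in centers])
print(cartesian_choice, polar_choice)  # 0 1

# Outliers: a single distant point drags the cluster mean away from the group.
tight = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
print(mean(tight))                   # (2.0, 1.0)
print(mean(tight + [(100.0, 1.0)]))  # (26.5, 1.0)
```

In Cartesian co-ordinates the point is closer to the first center, while in polar co-ordinates the Euclidean distance on (r, theta) favours the second one, so the assignment step would behave differently on the two representations.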


