k-means clustering and its use case in the security domain

Prudhvi Kumar Danapana
2 min read · Aug 11, 2021


What does k-means clustering mean?

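Here k is the number of clusters, "means" refers to the mean (centroid) of the points in each cluster, and clustering means grouping. So k-means clustering is an algorithm that measures the distance from each data point to the cluster centroids and assigns each point to the nearest cluster/group. It then recomputes each centroid as the mean of its assigned points and repeats until the assignments stop changing. This type of algorithm falls under unsupervised learning.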
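To make the assign-and-update loop concrete, here is a minimal sketch in plain Python. The function name `k_means`, its parameters, and the toy points are illustrative assumptions rather than part of any particular library; real projects would normally use an existing implementation.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Start from k data points picked at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we treat this as converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Toy data (made up for illustration): two well-separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
print(labels)  # the first three points share one label, the last three the other
```

Which group gets label 0 and which gets label 1 depends on the random starting centroids, so only the grouping itself is meaningful.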

Advantages of K-means

  1. It is very simple to implement.
  2. It scales to very large data sets and runs quickly on them.
  3. It adapts easily to new examples.
  4. It can be generalized to clusters of different shapes and sizes, such as elliptical clusters, although plain k-means works best on roughly spherical clusters of similar size.

Disadvantages of K-means

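  1. It is sensitive to outliers, because a single extreme point can pull a centroid away from the rest of its cluster.
  2. The value of k has to be chosen manually, which is difficult.
  3. As the number of dimensions increases, distances between points become less informative, so it scales poorly to high-dimensional data.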
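The sensitivity to outliers follows directly from the fact that a centroid is a mean. The small sketch below uses made-up points and a hypothetical helper named `centroid` to show how one extreme point shifts the centre of a cluster.

```python
def centroid(points):
    """Mean of a list of 2-D points, as computed in the k-means update step."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Made-up cluster of four nearby points.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
print(centroid(cluster))  # (1.5, 1.5)

# Adding a single far-away point drags the centroid well outside the original group.
with_outlier = cluster + [(50.0, 50.0)]
print(centroid(with_outlier))  # (11.2, 11.2)
```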

Use case in security:

  • Crime data can be analyzed by running k-means clustering on crime data sets using the RapidMiner tool.
  • The analysis is carried out in steps. First, a data set is obtained. Second, the data set is filtered according to the requirements, and a new data set is created containing only the attributes needed for the analysis.
  • Third, RapidMiner is opened and the Excel file is read. The “Replace Missing Values” operator is applied and executed. Fourth, the “Normalize” operator is applied to the resulting data set and executed.
  • Finally, k-means clustering is performed on the normalized data set, and the resulting clusters are analyzed.
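The two preprocessing steps before clustering can be sketched in plain Python as follows. This is only an approximation of the idea: it assumes missing values are filled with the column mean and that normalization rescales each column to the range 0 to 1, whereas the tool's own operators may use different settings. The function names and the toy rows are hypothetical.

```python
def replace_missing(rows):
    """Fill each None with the mean of the known values in that column."""
    columns = list(zip(*rows))
    means = []
    for col in columns:
        known = [v for v in col if v is not None]
        means.append(sum(known) / len(known))
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]


def normalize(rows):
    """Rescale every column to the range [0, 1] (min-max normalization)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Made-up toy rows with two numeric attributes; None marks a missing value.
rows = [[10.0, 200.0], [None, 400.0], [30.0, None], [20.0, 600.0]]
filled = replace_missing(rows)
scaled = normalize(filled)
print(filled)  # [[10.0, 200.0], [20.0, 400.0], [30.0, 400.0], [20.0, 600.0]]
print(scaled)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 0.5], [0.5, 1.0]]
```

Normalizing first matters because k-means relies on distances: without it, an attribute measured in large units would dominate the clustering.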

A use case in e-commerce:

Consider a set of parcels that have to be delivered to different areas.

Here the different areas can be treated as different groups. Each parcel should be delivered to its registered address, which falls within one particular area. So the delivery person should deliver each parcel within that area (group). If parcels are delivered to the wrong area (group), serious conflicts may arise, which may damage the company's reputation.
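As a rough illustration of grouping parcels by area, the sketch below assigns each parcel to the area whose centre is closest to its address. The area names, coordinates, and the helper `nearest_area` are all hypothetical, and the area centres are assumed to be known in advance rather than learned from the data.

```python
import math

# Hypothetical area centres and parcel addresses as (x, y) map coordinates.
area_centres = {"north": (0.0, 10.0), "south": (0.0, -10.0)}
parcels = {"P1": (1.0, 8.0), "P2": (-2.0, -9.0), "P3": (0.5, 12.0)}


def nearest_area(address, centres):
    """Return the name of the area whose centre is closest to the address."""
    return min(centres, key=lambda name: math.dist(address, centres[name]))


# Group the parcel IDs by the area they belong to.
routes = {}
for parcel_id, address in parcels.items():
    routes.setdefault(nearest_area(address, area_centres), []).append(parcel_id)
print(routes)  # {'north': ['P1', 'P3'], 'south': ['P2']}
```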
