Methods used for finding patterns in data:

- Cluster analysis – an algorithm finds groups of similar data points by examining the distance between points, their density, value ranges, etc. Models for cluster analysis:
  - connectivity – groups points based on how close they are to each other (hierarchical clustering)
  - partitioning – each data point is assigned to a cluster (the most commonly used algorithm is “K-means”)
  - distribution model – models each cluster as a statistical distribution
  - density analysis – based on how densely packed the points are: DBSCAN suits clusters of similar, high density, while OPTICS also handles clusters whose density varies
  - Clusters can be:
    - hard – every point belongs to exactly one cluster
    - soft – a point can belong to more than one cluster
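
To make the partitioning model concrete, here is a minimal K-means sketch in plain Python. It is an illustration only, not a reference implementation: the function name `kmeans`, the sample points, and the hand-picked starting centroids are all assumptions made for this example (real implementations usually pick starting centroids randomly or with k-means++).

```python
import math


def kmeans(points, initial_centroids, iterations=100):
    """Plain K-means on coordinate tuples; returns (centroids, labels)."""
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        # One label per point makes this a hard clustering.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop once neither the labels nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return centroids, labels


# Hypothetical 2-D data: two visibly separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
# Assumption: one starting centroid taken from each group.
centroids, labels = kmeans(points, initial_centroids=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each point receives exactly one label, so the result is a hard clustering. A soft variant would instead store a membership weight per cluster for every point.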

- Rules for partitioning – strict, overlapping, hierarchical

- Detection of anomalies
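- Association rules – if-then relationships between items that frequently occur together in the data (for example, customers who buy bread often also buy butter); they can guide decisions based on the data we have.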
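
An association rule is usually judged by its support (how often the items occur together) and its confidence (how often the consequent appears when the antecedent does). The sketch below computes both in plain Python. The helper names `support` and `confidence` and the shopping baskets are hypothetical, invented purely for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "butter"})        # 2 of 4 baskets
rule_confidence = confidence(baskets, {"bread"}, {"butter"})  # 2 of the 3 bread baskets
print(rule_support, rule_confidence)
```

Here the rule "bread -> butter" has support 0.5 and confidence of about 0.67. Algorithms such as Apriori search for all rules above chosen support and confidence thresholds rather than checking one rule at a time.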