General information about the Multilayer Clustering server

The server constructs relevant clusters of instances. The input is a data file with instances, and the output is a set of clusters, each defined by the instances it contains. The server implements a multilayer approach to clustering: instances can be described by one or two layers of attributes. Each layer must be prepared as a separate data file. The number and the order of instances must be identical in both layers.
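
The requirement that both layers describe the same instances can be checked locally before upload. The following is a minimal sketch, not part of the server: it assumes each layer has already been loaded into a list of rows (one row per instance), and the function name check_layer_alignment and the sample values are hypothetical.

```python
def check_layer_alignment(layer_a, layer_b):
    """Raise ValueError if the two layers hold different numbers of instances.

    Each layer is assumed to be a list of rows, one row per instance.
    Row order cannot be verified from the values alone, so keeping the
    same instance order in both layers remains the user's responsibility.
    """
    if len(layer_a) != len(layer_b):
        raise ValueError(
            f"layer 1 has {len(layer_a)} instances, layer 2 has {len(layer_b)}"
        )
    return len(layer_a)


# Two hypothetical layers describing the same three instances.
layer_1 = [[1.2, "red"], [0.7, "blue"], [3.1, "red"]]
layer_2 = [[10, 0.5, "yes"], [12, 0.1, "no"], [9, 0.9, "yes"]]
print(check_layer_alignment(layer_1, layer_2))
```

The layers may have different numbers of attributes; only the instance count and order have to match.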

Attributes may be either numerical or nominal. The goal of clustering is to group together instances with similar attribute values. Similarity of instances is estimated by a supervised learning process that is trained to discriminate real instances from random instances. Clusters are defined as maximal groups of instances that enable a reduction of the variability of the estimated similarity values. In the multilayer approach the reduction must be present in both layers. As a result, the clusters are smaller but more coherent.
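
To illustrate what "discriminating real instances from random instances" can look like as a learning task, here is a minimal sketch of how such a training set might be built. This page does not say how the server generates its random instances; shuffling each attribute column independently is one common choice and is an assumption here, as are the function names make_random_instances and label_real_vs_random.

```python
import random


def make_random_instances(rows, seed=0):
    """Build random instances by shuffling every attribute column on its own.

    Each attribute keeps its original distribution of values, but the
    relations between attributes are destroyed. This is an assumed
    construction, not necessarily the one used by the server.
    """
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*rows)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def label_real_vs_random(rows, seed=0):
    """Return (instance, label) pairs: label 1 for real, 0 for random."""
    real = [(list(row), 1) for row in rows]
    fake = [(row, 0) for row in make_random_instances(rows, seed)]
    return real + fake


# Hypothetical data: one numerical and one nominal attribute.
data = [[1.0, "a"], [2.0, "b"], [3.0, "a"], [4.0, "c"]]
training_set = label_real_vs_random(data)
print(len(training_set))
```

A classifier trained on such a set learns which combinations of attribute values are typical of real data, and that knowledge can then serve as a basis for estimating how similar two real instances are.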

Optionally, you may prepare and upload a file with the names of the instances. Each name must be on its own row, and the number of rows must be identical to the number of instances in the data files. If no file with instance names is uploaded, instances are referenced by their position in the data files.
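
The naming rule can be mirrored in a small local helper, for example to verify a names file before upload. This is a sketch under assumptions: the function name instance_labels is hypothetical, and positions are counted from 1 here, which this page does not specify.

```python
def instance_labels(n_instances, names=None):
    """Return one label per instance.

    names: optional list of lines read from the names file, one name per row.
    Without names, instances are labeled by position (1-based by assumption).
    """
    if names is None:
        return [str(i) for i in range(1, n_instances + 1)]
    cleaned = [line.strip() for line in names]
    if len(cleaned) != n_instances:
        raise ValueError(
            f"{len(cleaned)} names given for {n_instances} instances"
        )
    return cleaned


print(instance_labels(3))
print(instance_labels(3, ["sample_A\n", "sample_B\n", "sample_C\n"]))
```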

Preparation of the data files in the appropriate form is the most critical part of using this service. Please read the instructions very carefully. For practical reasons, the server accepts only data files with up to 1000 instances and 1000 attributes. The server is currently slow, so it is suggested to use it only with up to 250 instances.
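
The size limits can be checked before submitting a file. The sketch below assumes a data file has already been loaded as a list of rows (one row per instance, one value per attribute) and that the limits apply to each data file separately; the constant and function names are hypothetical.

```python
MAX_INSTANCES = 1000        # hard limit stated for the server
MAX_ATTRIBUTES = 1000       # hard limit stated for the server
SUGGESTED_INSTANCES = 250   # suggested limit while the server is slow


def check_size(rows):
    """Raise ValueError if a data file exceeds the hard limits.

    Returns True when the file is also within the suggested size,
    False when it is acceptable but likely to be slow.
    """
    n_instances = len(rows)
    n_attributes = len(rows[0]) if rows else 0
    if n_instances > MAX_INSTANCES:
        raise ValueError(f"too many instances: {n_instances}")
    if n_attributes > MAX_ATTRIBUTES:
        raise ValueError(f"too many attributes: {n_attributes}")
    return n_instances <= SUGGESTED_INSTANCES


# Hypothetical file with 300 instances and 5 attributes.
rows = [[0.0] * 5 for _ in range(300)]
print(check_size(rows))
```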

After clustering, all uploaded data are removed from the server. If another experiment with the same data is needed, the data file(s) must be resubmitted.

Security information

The system does not record any user data, but it also does not include any special security measures. In principle, users have no guarantee that their data will not be read and stored by the system, or perhaps even by other users of the server. If this may be a problem, it is the user's responsibility to code the learning examples so that important private data cannot be reconstructed from them. Generally, this is not a difficult task.
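
One simple way to code examples is to replace instance names and nominal attribute values with neutral codes and keep the translation table locally. The following is only a sketch of that idea: the function name encode_values, the prefixes, and the sample names are hypothetical, and whether such coding is sufficient depends on the data.

```python
def encode_values(values, prefix="v"):
    """Replace each distinct value with a neutral code such as v1, v2, ...

    Returns the coded list and the mapping from original value to code.
    The mapping stays with the user and is never uploaded.
    """
    mapping = {}
    coded = []
    for value in values:
        if value not in mapping:
            mapping[value] = f"{prefix}{len(mapping) + 1}"
        coded.append(mapping[value])
    return coded, mapping


# Hypothetical instance names; equal names receive equal codes.
names = ["Alice Smith", "Bob Jones", "Alice Smith"]
coded, key = encode_values(names, prefix="p")
print(coded)  # ['p1', 'p2', 'p1']
```

Because equal values receive equal codes, nominal attributes coded this way carry the same information for clustering, while the original values can be restored only with the locally stored mapping.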

© 2013 LIS - Rudjer Boskovic Institute
Last modified: September 11 2015 13:37:40.