General information about the Unsupervised Feature Ranking server

The server is intended for the identification of irrelevant features. The input is a data file with instances. The output is a list of the 10 most relevant features and a file with the instances described by a set of the 10-1000 most relevant features. Also available is a file listing the included features, denoted as A1-A10000.

Features may be either numerical or nominal, and unknown values are allowed. Relevant features are recognized by transforming the unsupervised task into a supervised task in which the positive examples are the original instances submitted by the user and the negative examples are obtained by random shuffling of the positive examples. The data prepared in this way enter a supervised learning process that consists of generating many random rules. The goal is to identify the features most useful for discriminating between original and randomized examples.
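The transformation into a supervised task can be sketched as follows. This is only an illustration, not the server's actual code: it assumes that "random shuffling" means permuting the values of each feature independently across instances, and the function name make_supervised_task and the use of None for unknown values are hypothetical. The random-rule learning step itself is not shown.

```python
import random

def make_supervised_task(instances, seed=0):
    """Return (rows, labels): originals labeled 1, shuffled copies labeled 0."""
    rng = random.Random(seed)
    # Transpose to columns so each feature can be shuffled on its own.
    columns = [list(col) for col in zip(*instances)]
    for col in columns:
        # Assumption: shuffling is done per feature, which keeps each
        # feature's value distribution but breaks links between features.
        rng.shuffle(col)
    negatives = [list(row) for row in zip(*columns)]
    rows = [list(r) for r in instances] + negatives
    labels = [1] * len(instances) + [0] * len(negatives)
    return rows, labels

data = [
    [1.0, "red", 10],
    [2.0, "blue", 20],
    [3.0, "red", 30],
    [4.0, None, 40],  # None stands in for an unknown value (hypothetical)
]
rows, labels = make_supervised_task(data)
```

Under this assumption, a feature that helps a supervised learner separate label 1 from label 0 is one whose values depend on other features in the original data, which is what makes it relevant.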

Preparing the data file in the appropriate form is the most critical part of using this service. Please read the instructions very carefully. For practical reasons the server accepts only data files with up to 25000 instances and 10000 attributes. Maximal execution time is limited to 5 minutes, so for large files you may receive no response.
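The size limits can be checked locally before a file is submitted. The sketch below is an assumption-laden example: it supposes a tab-delimited text file with one instance per line and no header line, which may differ from the format required by the instructions, and the function name check_limits is hypothetical.

```python
import csv

MAX_INSTANCES = 25000   # limit stated in the text above
MAX_ATTRIBUTES = 10000  # limit stated in the text above

def check_limits(lines, delimiter="\t"):
    """Return (n_instances, n_attributes, within_limits) for delimited rows."""
    n_instances = 0
    n_attributes = 0
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        n_instances += 1
        n_attributes = max(n_attributes, len(row))
    ok = n_instances <= MAX_INSTANCES and n_attributes <= MAX_ATTRIBUTES
    return n_instances, n_attributes, ok

# Assumed layout: tab-delimited values, one instance per line.
sample = ["1.0\tred\t10", "2.0\tblue\t20", "", "3.0\tred\t30"]
print(check_limits(sample))
```

For a real file, the same function can be given an open file object, for example check_limits(open("mydata.txt", newline="")), where mydata.txt is a placeholder name.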

After processing, all uploaded data are removed from the server. If another experiment with the same data is needed, the data file(s) must be resubmitted.

Security information

The system does not record any user data, but it also does not include any special security measures. In principle, the user has no guarantee that his/her data will not be read and stored by the system or perhaps even by other users of the server. If this may be a problem, it is the user's responsibility to code the learning examples so that the important private data cannot be reconstructed. Generally, this is not a difficult task.
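One possible way to code the learning examples is to replace every nominal value with an opaque code before upload and to keep the mapping locally. The sketch below is a hypothetical example, not a procedure prescribed by the server: the function name encode_nominal and the code labels v0, v1, ... are invented for illustration, and numerical values are left unchanged, so they may need separate treatment if they are sensitive.

```python
import random

def encode_nominal(instances, seed=None):
    """Replace each string value with an opaque per-column code.

    Returns (encoded_rows, mappings). The mappings stay with the user so
    that results can be translated back locally.
    """
    rng = random.Random(seed)
    n_cols = len(instances[0])
    mappings = []
    for j in range(n_cols):
        values = sorted({row[j] for row in instances if isinstance(row[j], str)})
        codes = [f"v{k}" for k in range(len(values))]
        rng.shuffle(codes)  # code order reveals nothing about value order
        mappings.append(dict(zip(values, codes)))
    # Non-string values (numbers, None) are not in the mapping and pass through.
    encoded = [
        [mappings[j].get(v, v) for j, v in enumerate(row)]
        for row in instances
    ]
    return encoded, mappings

records = [
    ["Zagreb", 34, "smoker"],
    ["Split", 51, "non-smoker"],
    ["Zagreb", 47, None],
]
encoded, mappings = encode_nominal(records, seed=0)
```

Because only the labels of nominal values change, the relationships between features that the ranking relies on are preserved.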

© 2016 LIS - Rudjer Boskovic Institute
Last modified: January 11 2016 15:22:52.