JMinHEP - data mining framework for high-energy physics


JMinHEP is a framework for clustering analysis, i.e. for unsupervised learning in which the classification process does not depend on a priori information. It was designed mainly for the high-energy physics community, but everyone is welcome to use it. The program is a pure Java application and includes the following algorithms:
  1. K-means clustering analysis (single and multi pass)
  2. C-means (fuzzy) algorithm
  3. Agglomerative hierarchical clustering
  4. More algorithms will be included soon
More information can be found on en.wikipedia.org or in this tutorial. The algorithms can run in a fixed-cluster mode or in a best-estimate mode, i.e. when the number of clusters is not given a priori but is found from an estimate of the cluster compactness. The data points can be defined in a multidimensional space. At present, the distance measure is Euclidean.
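
To make the fixed-cluster and best-estimate modes more concrete, here is a minimal, self-contained Java sketch of single-pass k-means with a Euclidean distance and a compactness measure. It is not the JMinHEP implementation and does not use the JMinHEP API. The class name KMeansSketch, the choice of the first k points as seeds, and the definition of compactness as the mean squared distance to the nearest center are all assumptions made for illustration.

```java
/** Illustrative k-means sketch (hypothetical class, NOT part of JMinHEP). */
final class KMeansSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Index of the center closest to point p (ties go to the lowest index). */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (distance(p, centers[c]) < distance(p, centers[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Fixed-cluster mode: k-means with k given in advance.
     * Assumption: the first k data points are used as seeds, for determinism.
     * Returns the final cluster centers.
     */
    static double[][] cluster(double[][] data, int k, int maxIter) {
        if (k < 1 || k > data.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int dim = data[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = data[c].clone();
        }
        int[] label = new int[data.length];
        java.util.Arrays.fill(label, -1);

        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step: attach every point to its nearest center.
            boolean changed = false;
            for (int i = 0; i < data.length; i++) {
                int c = nearest(data[i], centers);
                if (c != label[i]) {
                    label[i] = c;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point changed its cluster
            }
            // Update step: move every center to the mean of its points.
            double[][] sum = new double[k][dim];
            int[] count = new int[k];
            for (int i = 0; i < data.length; i++) {
                count[label[i]]++;
                for (int d = 0; d < dim; d++) {
                    sum[label[i]][d] += data[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (count[c] == 0) {
                    continue; // empty cluster: keep the old center
                }
                for (int d = 0; d < dim; d++) {
                    centers[c][d] = sum[c][d] / count[c];
                }
            }
        }
        return centers;
    }

    /** Assumed compactness: mean squared distance of the points to their nearest center. */
    static double compactness(double[][] data, double[][] centers) {
        double total = 0.0;
        for (double[] p : data) {
            double d = distance(p, centers[nearest(p, centers)]);
            total += d * d;
        }
        return total / data.length;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {10, 10}, {0, 1}, {10, 11}, {1, 0}, {11, 10}};
        // Scan several values of k and report the compactness of each partition.
        for (int k = 1; k <= 3; k++) {
            double[][] centers = cluster(data, k, 100);
            System.out.printf("k=%d compactness=%.3f%n", k, compactness(data, centers));
        }
    }
}
```

With this definition the compactness usually keeps shrinking as k grows, so a best-estimate mode needs an extra criterion, for example the value of k after which the improvement becomes small. Which criterion JMinHEP applies is not stated here.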

Download: JMinHEP.tar.gz
Then unzip and untar it (tar -zvxf JMinHEP.tar.gz under Linux/Unix). This will create a directory, jminhep, containing the JMinHEP.jar file. You can run it as usual, i.e. java -jar JMinHEP.jar


The program can be run in two ways: in GUI mode or embedded in your own application.

JMinHEP GUI mode

Just run it as:

java -jar JMinHEP.jar

and load any ARFF file. Some example files can be found here: iris.arff or my.arff. You can also load the data directly from the command line:

java -jar JMinHEP.jar iris.arff
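
For readers who have not met the ARFF format: a file consists of a header with @relation and @attribute lines, followed by @data and comma-separated rows, with % starting a comment. The sketch below reads the numeric columns of such a file in plain Java. It is only an illustration of the format and not the loader used by JMinHEP. The class name ArffSketch is hypothetical, and the sketch deliberately ignores missing values (?), quoted strings and the sparse ARFF notation.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Minimal reader for the numeric columns of an ARFF file (hypothetical, NOT part of JMinHEP). */
final class ArffSketch {

    /** Returns one double[] per data row, holding only the numeric attributes. */
    static List<double[]> readNumeric(Iterable<String> lines) {
        List<Boolean> isNumeric = new ArrayList<>();
        List<double[]> rows = new ArrayList<>();
        boolean inData = false;

        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // skip blank lines and comments
            }
            String lower = line.toLowerCase(Locale.ROOT);
            if (!inData) {
                if (lower.startsWith("@attribute")) {
                    // The attribute type is the last whitespace-separated token.
                    String[] tokens = lower.split("\\s+");
                    String type = tokens[tokens.length - 1];
                    isNumeric.add(type.equals("numeric")
                            || type.equals("real")
                            || type.equals("integer"));
                } else if (lower.startsWith("@data")) {
                    inData = true;
                }
                continue; // @relation and other header lines are ignored
            }

            String[] cells = line.split(",");
            if (cells.length != isNumeric.size()) {
                throw new IllegalArgumentException("Unexpected number of columns: " + line);
            }
            int nNumeric = 0;
            for (boolean b : isNumeric) {
                if (b) {
                    nNumeric++;
                }
            }
            double[] row = new double[nNumeric];
            int j = 0;
            for (int i = 0; i < cells.length; i++) {
                if (isNumeric.get(i)) {
                    row[j++] = Double.parseDouble(cells[i].trim());
                }
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ArffSketch iris.arff
        List<double[]> rows = readNumeric(Files.readAllLines(Paths.get(args[0])));
        int columns = rows.isEmpty() ? 0 : rows.get(0).length;
        System.out.println(rows.size() + " rows, " + columns + " numeric columns");
    }
}
```

Nominal attributes such as a class label are dropped in this sketch, which suits clustering because the label is not used as input.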

Screenshot of JMinHEP


JMinHEP embedded

You can include JMinHEP.jar in your own application. Look at the example application located in the "example" directory. To compile it, you need to add JMinHEP.jar to the Java classpath.

In short, you just need this statement in your code:

import jminhep.clanalyse.*;

Then load the data into the dataHolder. The Partition class does the clustering. You can then run any clustering algorithm, depending on the input mode (the correct mode is shown in the status bar of the GUI). You can access all output information by calling the methods getName(), getCompactness(), getNclusters(), getCenters() and getClusterNumber(). The example program runs over all possible clustering modes and then prints the final result. Read the API documentation to learn more about the Partition and dataHolder classes.

Note: JMinHEP is not completely free software. Read the JMinHEP License. The package is based on the free JFreeChart package by Object Refinery Limited and Contributors. JFreeChart is licensed under the terms of the GNU Lesser General Public Licence (LGPL).

You can send your algorithm to me for inclusion, provided that it follows the coding standard given by dataHolder and Partitioner.


S.Chekanov