Class ClusterFinder
- java.lang.Object
-
- com.astrolabsoftware.FinkBrowser.HBaser.Clusteriser.ClusterFinder
-
public class ClusterFinder extends java.lang.Object
ClusterFinder identifies HBase rows with clusters defined by a previous clustering algorithm; the cluster definitions are read from JSON model files.
- Author:
- J.Hrivnac
-
-
Field Summary
Fields
Modifier and Type                                           Field                Description
private org.apache.commons.math3.linear.RealMatrix          _clusterCenters
private double[]                                            _explainedVariance
private double[]                                            _mean
private org.apache.commons.math3.linear.RealMatrix          _pcaComponents
private static double                                       _separation
private double[]                                            _std
private static org.apache.logging.log4j.Logger              log                  Logging.
-
Constructor Summary
Constructors
Constructor                                                                                            Description
ClusterFinder(java.lang.String scalerFile, java.lang.String pcaFile, java.lang.String clustersFile)
-
Method Summary
Modifier and Type       Method                                          Description
private double[]        applyPCA(double[] standardizedInput)
private int             findClosestCluster(double[] transformedData)    Find the closest cluster from the transformed data.
private void            loadClusterCenters(java.lang.String filePath)
private void            loadPCAParams(java.lang.String filePath)
private void            loadScalerParams(java.lang.String filePath)
static void             main(java.lang.String[] args)
private static void     setSeparation(double separation)                Set the minimal separation quotient.
private double[]        standardize(double[] input)
int                     transformAndPredict(double[] inputData)         Transform provided data array and find the closest cluster.
-
-
-
Field Detail
-
_separation
private static double _separation
-
_mean
private double[] _mean
-
_std
private double[] _std
-
_pcaComponents
private org.apache.commons.math3.linear.RealMatrix _pcaComponents
-
_explainedVariance
private double[] _explainedVariance
-
_clusterCenters
private org.apache.commons.math3.linear.RealMatrix _clusterCenters
-
log
private static org.apache.logging.log4j.Logger log
Logging.
-
-
Constructor Detail
-
ClusterFinder
public ClusterFinder(java.lang.String scalerFile, java.lang.String pcaFile, java.lang.String clustersFile) throws java.io.IOException
- Throws:
java.io.IOException
-
-
Method Detail
-
main
public static void main(java.lang.String[] args) throws java.io.IOException
- Throws:
java.io.IOException
-
loadScalerParams
private void loadScalerParams(java.lang.String filePath) throws java.io.IOException
- Throws:
java.io.IOException
-
loadPCAParams
private void loadPCAParams(java.lang.String filePath) throws java.io.IOException
- Throws:
java.io.IOException
-
loadClusterCenters
private void loadClusterCenters(java.lang.String filePath) throws java.io.IOException
- Throws:
java.io.IOException
-
standardize
private double[] standardize(double[] input)
-
applyPCA
private double[] applyPCA(double[] standardizedInput)
-
findClosestCluster
private int findClosestCluster(double[] transformedData)
Find the closest cluster from the transformed data.
- Parameters:
transformedData - The transformed input data.
- Returns:
The number of the closest cluster, or -1 if it cannot be found with sufficient resolution.
-
transformAndPredict
public int transformAndPredict(double[] inputData)
Transform the provided data array and find the closest cluster.
- Parameters:
inputData - The original input data.
- Returns:
The number of the closest cluster, or -1 if it cannot be found with sufficient resolution.
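The private helpers listed on this page suggest that transformAndPredict chains standardize, applyPCA and findClosestCluster. The following is a minimal sketch of such a chain, not the ClusterFinder source. It assumes that standardization is (x - mean) / std per feature, that each row of the component matrix is one principal component, and that closeness is Euclidean distance. Plain arrays stand in for RealMatrix, and the class TransformSketch with all its method names is hypothetical. The reliability check that can yield -1 is left out here.

```java
// Illustrative sketch only: NOT the ClusterFinder implementation.
// TransformSketch and all of its methods are hypothetical names.
class TransformSketch {

    // Assumed scaling: subtract the per-feature mean, divide by the per-feature std.
    static double[] standardize(double[] input, double[] mean, double[] std) {
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = (input[i] - mean[i]) / std[i];
        }
        return out;
    }

    // Assumed layout: components[k] is the k-th principal component (one row each).
    static double[] project(double[] x, double[][] components) {
        double[] out = new double[components.length];
        for (int k = 0; k < components.length; k++) {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += components[k][i] * x[i];
            }
            out[k] = sum;
        }
        return out;
    }

    // Index of the center with the smallest squared Euclidean distance to the point.
    static int nearest(double[] point, double[][] centers) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int i = 0; i < point.length; i++) {
                double diff = point[i] - centers[c][i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // Chain the three steps: standardize, project, pick the nearest center.
    static int transformAndPredict(double[] input, double[] mean, double[] std,
                                   double[][] components, double[][] centers) {
        return nearest(project(standardize(input, mean, std), components), centers);
    }

    public static void main(String[] args) {
        double[] mean = {1.0, 2.0};
        double[] std = {2.0, 4.0};
        double[][] components = {{0.0, 1.0}, {1.0, 0.0}};
        double[][] centers = {{2.0, 1.0}, {-5.0, -5.0}};
        // Input {3, 10} standardizes to {1, 2} and projects to {2, 1}.
        System.out.println(transformAndPredict(new double[] {3.0, 10.0},
                                               mean, std, components, centers));
    }
}
```

In this toy input the projected point coincides with the first center, so the sketch selects cluster 0.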
-
setSeparation
private static void setSeparation(double separation)
Set the minimal separation quotient.
- Parameters:
separation - The minimal separation quotient. The ratio of the distance to the closest cluster to the distance to the second-closest cluster must be smaller than separation; otherwise the cluster is not considered reliable. A value of 1 imposes no restriction. The default is 0.5.
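The separation rule can be illustrated with a short sketch. It is not taken from the ClusterFinder source: the class SeparationSketch and its method are hypothetical, the distances are assumed to be precomputed, and the strict "smaller than" comparison follows the wording above (how the real code treats exact ties is not documented here).

```java
// Illustrative sketch only: NOT the ClusterFinder implementation.
// SeparationSketch and closestWithSeparation are hypothetical names.
class SeparationSketch {

    // distances[i] is the distance from the data point to cluster i.
    // Returns the index of the closest cluster, or -1 when the ratio
    // closest / second-closest is not smaller than the separation quotient.
    static int closestWithSeparation(double[] distances, double separation) {
        int best = -1;
        double d1 = Double.POSITIVE_INFINITY; // closest distance so far
        double d2 = Double.POSITIVE_INFINITY; // second-closest distance so far
        for (int i = 0; i < distances.length; i++) {
            if (distances[i] < d1) {
                d2 = d1;
                d1 = distances[i];
                best = i;
            } else if (distances[i] < d2) {
                d2 = distances[i];
            }
        }
        // Strict comparison is an assumption based on the documented wording.
        return (d1 / d2 < separation) ? best : -1;
    }

    public static void main(String[] args) {
        // Ratio 1/3 is below 0.5, so cluster 0 is accepted.
        System.out.println(closestWithSeparation(new double[] {1.0, 4.0, 3.0}, 0.5));
        // Ratio 2/3 is not below 0.5, so the result is -1.
        System.out.println(closestWithSeparation(new double[] {2.0, 3.0}, 0.5));
    }
}
```

With the default of 0.5, a point is accepted only when the closest cluster is less than half as far away as the second-closest one. Raising the value toward 1 accepts progressively more ambiguous points.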
-
-