Class AvroImporter

java.lang.Object
com.Lomikel.Januser.JanusClient
com.astrolabsoftware.FinkBrowser.Avro.AvroImporter
All Implemented Interfaces:
ModifyingGremlinClient
Direct Known Subclasses:
HDFSAvroImporter

public class AvroImporter extends JanusClient
AvroImporter imports Avro files into JanusGraph.
Author:
J.Hrivnac
  • Field Details

  • Constructor Details

    • AvroImporter

      public AvroImporter(String properties, int reportLimit, int commitLimit, String strategy, String fitsDir, String hbaseUrl, String dataType)
      Create with JanusGraph properties file.
      Parameters:
      properties - The file with the complete JanusGraph properties.
      reportLimit - The number of events between progress reports (-1 means no report until the end).
      commitLimit - The number of events to commit in one step (-1 means commit only at the end).
      strategy - The creation strategy: drop, replace, getOrCreate.
      fitsDir - The directory for FITS files. If null or empty, FITS data are included in the graph. Ignored if the HBase URL is set.
      hbaseUrl - The URL of the HBase table with full data, as ip:port:table:schema. May be null or empty.
      dataType - The data type, alert|pca. If null, alert is assumed.
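
The reportLimit and commitLimit semantics above can be illustrated with a small counter. This is only a sketch under stated assumptions: BatchCounter and its methods are hypothetical names, not part of AvroImporter, and the real class may schedule reports and commits differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of AvroImporter): decides when to report and commit. */
class BatchCounter {
  private final int reportLimit; // events between progress reports, -1 = only at the end
  private final int commitLimit; // events per commit, -1 = only at the end
  private int n = 0;             // number of events seen so far

  /** Log of the actions taken, kept only so the behaviour can be inspected. */
  final List<String> actions = new ArrayList<>();

  BatchCounter(int reportLimit, int commitLimit) {
    this.reportLimit = reportLimit;
    this.commitLimit = commitLimit;
  }

  /** Call once per processed event. */
  void event() {
    n++;
    // A non-positive limit disables the periodic action.
    if (reportLimit > 0 && n % reportLimit == 0) {
      actions.add("report@" + n);
    }
    if (commitLimit > 0 && n % commitLimit == 0) {
      actions.add("commit@" + n);
    }
  }

  /** Call once when the input is exhausted: a final report and commit always happen. */
  void end() {
    actions.add("report@" + n);
    actions.add("commit@" + n);
  }
}
```

With reportLimit = -1 and commitLimit = 2, five events would lead to commits after the second and fourth event, followed by one final report and commit.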
  • Method Details

    • main

      public static void main(String[] args) throws IOException
      Import Avro files or directory.
      Parameters:
      args - [0] The JanusGraph properties file.
      args - [1] The Avro file or directory with Avro files.
      args - [2] The directory for FITS files. If null or empty, FITS data are included in the graph. Ignored if the HBase URL is set.
      args - [3] The URL of the HBase table with full data, as ip:port:table:schema. May be null or empty.
      args - [4] The number of events between progress reports (-1 means no report until the end).
      args - [5] The number of events to commit in one step (-1 means commit only at the end).
      args - [6] The creation strategy: create, drop, replace, skip.
      args - [7] The data type, alert|pca. If null, alert is assumed.
      Throws:
      LomikelException - If anything goes wrong.
      IOException
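
The positional layout of args can be made explicit with a small holder class. The sketch below is illustrative only: ImportArgs is a hypothetical name, and treating a missing eighth argument as null (and therefore as alert) is an assumption, since the documentation only says what happens for a null data type.

```java
/** Hypothetical holder for the eight positional arguments documented for main. */
class ImportArgs {
  final String properties; // args[0]: JanusGraph properties file
  final String input;      // args[1]: Avro file or directory
  final String fitsDir;    // args[2]: FITS directory, may be empty
  final String hbaseUrl;   // args[3]: ip:port:table:schema, may be empty
  final int reportLimit;   // args[4]: -1 = report only at the end
  final int commitLimit;   // args[5]: -1 = commit only at the end
  final String strategy;   // args[6]: creation strategy
  final String dataType;   // args[7]: alert or pca

  ImportArgs(String[] args) {
    if (args.length < 7) {
      throw new IllegalArgumentException("expected at least 7 arguments, got " + args.length);
    }
    properties  = args[0];
    input       = args[1];
    fitsDir     = args[2];
    hbaseUrl    = args[3];
    reportLimit = Integer.parseInt(args[4]);
    commitLimit = Integer.parseInt(args[5]);
    strategy    = args[6];
    // Assumption: an absent eighth argument is handled like a null data type.
    dataType    = (args.length > 7 && args[7] != null) ? args[7] : "alert";
  }
}
```

Parsing the numeric limits up front means a malformed value fails with a NumberFormatException before any import work starts.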
    • processDir

      public void processDir(String dirFN, String fileExt) throws IOException
      Process directory with Avro alert files (recursive).
      Parameters:
      dirFN - The name of the directory with data files.
      fileExt - The file extension.
      Throws:
      IOException - If problem with file reading.
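
A recursive scan of this kind can be sketched with the standard java.nio.file API. AvroFileFinder is a hypothetical name, and the sketch assumes that fileExt is given without a leading dot; the actual processDir implementation may traverse the tree differently.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of AvroImporter): finds data files recursively. */
class AvroFileFinder {

  /**
   * Collect all regular files below dir whose names end with "." + fileExt.
   * Assumption: fileExt carries no leading dot, e.g. "avro".
   */
  static List<Path> find(Path dir, String fileExt) throws IOException {
    // Files.walk visits the directory tree depth-first; the stream must be closed.
    try (Stream<Path> paths = Files.walk(dir)) {
      return paths.filter(Files::isRegularFile)
                  .filter(p -> p.getFileName().toString().endsWith("." + fileExt))
                  .sorted()
                  .collect(Collectors.toList());
    }
  }
}
```

Each returned path could then be handed to a per-file step such as processFile.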
    • process

      public void process(String fn) throws IOException, LomikelException
      Process Avro alert file or directory with files (recursive).
      Parameters:
      fn - The filename of the data file or directory with files.
      Throws:
      IOException - If problem with file reading.
      LomikelException - If anything goes wrong.
    • register

      public void register(String fn)
      Register Import Vertex.
      Parameters:
      fn - The filename of the data file or directory with files.
    • processFile

      public void processFile(File file) throws IOException
      Process Avro alert file.
      Parameters:
      file - The data file.
      Throws:
      IOException - If problem with file reading.
    • processRecord

      public void processRecord(GenericRecord record)
      Process GenericRecord of the requested type.
      Parameters:
      record - The GenericRecord to be processed.
    • processAlert

      Process Avro alert.
      Parameters:
      record - The full alert GenericRecord.
      Returns:
      The created Vertex.
    • processPCA

      public Vertex processPCA(GenericRecord record)
      Process Avro PCA. PCAs are created only for already existing sources. Whether the PCA already exists is not verified, so multiple PCAs for one source may be created.
      Parameters:
      record - The full PCA GenericRecord.
      Returns:
      The created Vertex.
    • processGenericRecord

      private Vertex processGenericRecord(GenericRecord record, String name, String idName, boolean tryDirection, Vertex mother, String edgeName, List<String> fields, String objectId)
      Process Avro GenericRecord.
      Parameters:
      record - The GenericRecord to process.
      name - The name of new Vertex.
      idName - The name of the unique identifying field.
      tryDirection - Whether to try to create the Direction property from the ra, dec fields.
      mother - The mother Vertex.
      edgeName - The name of the edge to the mother Vertex.
      fields - The list of fields to fill. All fields are filled if null.
      objectId - The objectId of the containing source.
      Returns:
      The created Vertex.
    • processCutout

      private void processCutout(GenericRecord record, Vertex mother, String objectId, String jd)
      Process Avro cutout.
      Parameters:
      record - The GenericRecord to process.
      mother - The Vertex to attach to.
      jd - The jd of the corresponding candidate.
    • getSimpleValues

      private Map<String,String> getSimpleValues(GenericRecord record, List<String> fields)
      Register part of GenericRecord in HBase.
      Parameters:
      record - The GenericRecord to be registered in HBase.
      fields - The fields to be mapped.
    • getSimpleFields

      private List<String> getSimpleFields(GenericRecord record, List<String> keeps, String[] avoids)
      Get Schema.Fields corresponding to simple types and having non-null values.
      Parameters:
      record - The GenericRecord to use.
      keeps - The names of the fields to report. Report all if null.
      avoids - The array of field names not to report. Cannot override the keeps argument.
      Returns:
      The list of corresponding fields.
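
One possible reading of the keeps/avoids rules is sketched below on a plain Map instead of an Avro GenericRecord, so that it runs without the Avro library. SimpleFieldFilter, its methods, and the choice of which Java types count as simple are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the keeps/avoids selection, using a Map as the record. */
class SimpleFieldFilter {

  /** Assumption: numbers, strings and booleans count as simple; null is never simple. */
  static boolean isSimple(Object value) {
    return value instanceof Number || value instanceof CharSequence || value instanceof Boolean;
  }

  /**
   * Select names of fields with simple, non-null values.
   * If keeps is null, all such fields are reported except those in avoids.
   * If keeps is given, exactly the kept fields are reported, even when also listed in avoids.
   */
  static List<String> select(Map<String, Object> record, List<String> keeps, String[] avoids) {
    List<String> avoidList = (avoids == null) ? Collections.<String>emptyList() : Arrays.asList(avoids);
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, Object> entry : record.entrySet()) {
      String name = entry.getKey();
      if (!isSimple(entry.getValue())) {
        continue; // skips null values and complex values such as byte arrays
      }
      boolean wanted = (keeps == null) ? !avoidList.contains(name) : keeps.contains(name);
      if (wanted) {
        result.add(name);
      }
    }
    return result;
  }
}
```

Under this reading, avoids only has an effect when keeps is null, which is one way to satisfy the rule that avoids cannot override keeps.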
    • vertex

      private Vertex vertex(GenericRecord record, String label, String property, String strategy)
      Create or drop a Vertex according to chosen strategy.
      Parameters:
      record - The full GenericRecord.
      label - The Vertex label.
      property - The name of Vertex property. If null strategy is ignored and Vertex is created.
      strategy - The creation strategy: drop, replace, reuse, skip, create. If anything else, the global strategy is used.
      Returns:
      The created Vertex or null.
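
The fallback from a per-call strategy to the global one can be sketched as follows. StrategyResolver is a hypothetical name; the sketch only covers which strategy is selected, not what each strategy does to the graph.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not part of AvroImporter): picks the effective strategy. */
class StrategyResolver {

  // The strategy names listed for the strategy parameter.
  private static final Set<String> KNOWN =
      new HashSet<>(Arrays.asList("drop", "replace", "reuse", "skip", "create"));

  /** Return local if it is a recognised strategy name, otherwise the global strategy. */
  static String resolve(String local, String global) {
    return (local != null && KNOWN.contains(local)) ? local : global;
  }
}
```

Passing null or an unrecognised name as the local strategy therefore defers to the importer-wide setting.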
    • writeFits

      protected void writeFits(String fn, byte[] data)
      Write FITS file.
      Parameters:
      fn - The FITS file name.
      data - The FITS file content.
    • fitsDir

      protected String fitsDir()
      The directory for FITS files.
      Returns:
      The FITS file directory.
    • hbaseUrl

      protected String hbaseUrl()
      The data HBase table url.
      Returns:
      The data HBase table url.
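
The ip:port:table:schema format documented for the constructor can be split into its four parts as sketched here. HBaseLocation is a hypothetical name, and the sketch assumes that none of the parts contains a colon itself (so IPv6 literals are not handled).

```java
/** Hypothetical value class for an HBase url of the form ip:port:table:schema. */
class HBaseLocation {
  final String ip;
  final int port;
  final String table;
  final String schema;

  private HBaseLocation(String ip, int port, String table, String schema) {
    this.ip = ip;
    this.port = port;
    this.table = table;
    this.schema = schema;
  }

  /** Parse the url; returns null for a null or empty url, which the documentation allows. */
  static HBaseLocation parse(String url) {
    if (url == null || url.trim().isEmpty()) {
      return null;
    }
    // Limit -1 keeps trailing empty parts so that the part count is checked strictly.
    String[] parts = url.trim().split(":", -1);
    if (parts.length != 4) {
      throw new IllegalArgumentException("expected ip:port:table:schema, got: " + url);
    }
    return new HBaseLocation(parts[0], Integer.parseInt(parts[1]), parts[2], parts[3]);
  }
}
```

Returning null for an empty url mirrors the documented rule that the HBase url may be null or empty, in which case the FITS directory setting applies instead.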
    • n

      protected int n()
      Give number of created alerts/pcas.
      Returns:
      The number of created alerts/pcas.
    • skip

      protected boolean skip()
      Tell whether the import should be skipped.
      Returns:
      Whether the import should be skipped.
    • close

      public void close()
      Description copied from interface: ModifyingGremlinClient
      Close graph.
      Specified by:
      close in interface ModifyingGremlinClient
      Overrides:
      close in class JanusClient
    • now

      protected void now()
      Set new Date.