Class DataDescription

java.lang.Object
org.elasticsearch.client.ml.job.config.DataDescription
All Implemented Interfaces:
org.elasticsearch.xcontent.ToXContent, org.elasticsearch.xcontent.ToXContentObject

public class DataDescription extends Object implements org.elasticsearch.xcontent.ToXContentObject
Describes the format of the data used in the job and how it should be interpreted by the ML job.

getTimeField() is the name of the field containing the timestamp, and getTimeFormat() is the format code for the date string, as described by DateTimeFormatter.

  • Field Details

    • EPOCH

      public static final String EPOCH
      Special time format string for epoch times (seconds)
    • EPOCH_MS

      public static final String EPOCH_MS
      Special time format string for epoch times (milliseconds)
    • DEFAULT_TIME_FIELD

      public static final String DEFAULT_TIME_FIELD
      By default, autodetect expects the timestamp in a field with this name.
    • DEFAULT_DELIMITER

      public static final char DEFAULT_DELIMITER
      The default field delimiter expected by the native autodetect program.
    • DEFAULT_QUOTE_CHAR

      public static final char DEFAULT_QUOTE_CHAR
      The default quote character used to escape text in delimited data formats
    • PARSER

      public static final org.elasticsearch.xcontent.ObjectParser<DataDescription.Builder,Void> PARSER
  • Constructor Details

  • Method Details

    • toXContent

      public org.elasticsearch.xcontent.XContentBuilder toXContent(org.elasticsearch.xcontent.XContentBuilder builder, org.elasticsearch.xcontent.ToXContent.Params params) throws IOException
      Specified by:
      toXContent in interface org.elasticsearch.xcontent.ToXContent
      Throws:
      IOException
    • getFormat

      public DataDescription.DataFormat getFormat()
      The format of the data to be processed. Defaults to DataDescription.DataFormat.XCONTENT
      Returns:
      The data format
    • getTimeField

      public String getTimeField()
      The name of the field containing the timestamp
      Returns:
      A String if set or null
    • getTimeFormat

      public String getTimeFormat()
      Either "epoch", "epoch_ms", or a date format pattern string as described by DateTimeFormatter. If not set (null or an empty string) or set to "epoch_ms" (the default), the timestamp is assumed to be in milliseconds since the epoch.
      Returns:
      A String if set or null
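
      The three kinds of time format described above can be illustrated with a small standalone sketch. This is not the library's implementation: the class TimeFormatSketch and the method toEpochMillis are hypothetical names, the constant values are taken from the strings quoted in the description, and treating zone-less pattern timestamps as UTC is an assumption made only for this example.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the Elasticsearch client.
final class TimeFormatSketch {
    // Values taken from the strings quoted in the getTimeFormat() description.
    static final String EPOCH = "epoch";
    static final String EPOCH_MS = "epoch_ms";

    /** Converts a raw timestamp string to milliseconds since the epoch. */
    static long toEpochMillis(String rawTimestamp, String timeFormat) {
        // Unset (null or empty) or "epoch_ms": the value is already in milliseconds.
        if (timeFormat == null || timeFormat.isEmpty() || EPOCH_MS.equals(timeFormat)) {
            return Long.parseLong(rawTimestamp);
        }
        // "epoch": the value is in seconds, so scale to milliseconds.
        if (EPOCH.equals(timeFormat)) {
            return Long.parseLong(rawTimestamp) * 1000L;
        }
        // Otherwise treat the format as a DateTimeFormatter pattern.
        // Assumption for this sketch: the pattern has date and time but no zone, read as UTC.
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(timeFormat);
        return LocalDateTime.parse(rawTimestamp, formatter)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(toEpochMillis("1500000000", EPOCH));
        System.out.println(toEpochMillis("1500000000000", null));
        System.out.println(toEpochMillis("2017-07-14 02:40:00", "yyyy-MM-dd HH:mm:ss"));
    }
}
```

      All three calls in main describe the same instant, so under the stated UTC assumption they should print the same millisecond value.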
    • getFieldDelimiter

      public Character getFieldDelimiter()
      If the data is in a delimited format with a header, e.g. CSV or TSV, this is the delimiter character used. This is only applicable if getFormat() is DataDescription.DataFormat.DELIMITED. The default value for delimited format is '\t'.
      Returns:
      A char
    • getQuoteCharacter

      public Character getQuoteCharacter()
      The quote character used in delimited formats. The default value for delimited format is '\"'.
      Returns:
      The delimited format quote character
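
      To show how a field delimiter and a quote character interact, here is a minimal standalone sketch that splits one line of delimited data. It does not reproduce the native autodetect parser: DelimitedLineSketch and split are hypothetical names, the default values mirror the ones stated above, and treating a doubled quote inside a quoted field as a literal quote is an assumption borrowed from common CSV practice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Elasticsearch client.
final class DelimitedLineSketch {
    // Defaults as stated in the getFieldDelimiter() and getQuoteCharacter() descriptions.
    static final char DEFAULT_DELIMITER = '\t';
    static final char DEFAULT_QUOTE_CHAR = '"';

    /** Splits a single line into fields, honouring quoted sections. */
    static List<String> split(String line, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                // Assumption: a doubled quote inside a quoted field is a literal quote.
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    current.append(quote);
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == delimiter && !inQuotes) {
                // An unquoted delimiter ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "time\t\"host\tname\"\tvalue";
        System.out.println(split(line, DEFAULT_DELIMITER, DEFAULT_QUOTE_CHAR));
    }
}
```

      In the example line, the tab between "host" and "name" sits inside quotes, so it stays part of the second field instead of starting a new one.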
    • equals

      public boolean equals(Object other)
      Overridden equality test
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object