Class DataDescription

java.lang.Object
org.elasticsearch.client.ml.job.config.DataDescription
All Implemented Interfaces:
org.elasticsearch.common.xcontent.ToXContent, org.elasticsearch.common.xcontent.ToXContentObject

public class DataDescription
extends java.lang.Object
implements org.elasticsearch.common.xcontent.ToXContentObject
Describes the format of the data used in the job and how it should be interpreted by the ML job.

getTimeField() is the name of the field containing the timestamp, and getTimeFormat() is the format code for the date string, as described by DateTimeFormatter.

  • Nested Class Summary

    Nested Classes
    Modifier and Type Class Description
    static class  DataDescription.Builder  
    static class  DataDescription.DataFormat
    Enum of the acceptable data formats.

    Nested classes/interfaces inherited from interface org.elasticsearch.common.xcontent.ToXContent

    org.elasticsearch.common.xcontent.ToXContent.DelegatingMapParams, org.elasticsearch.common.xcontent.ToXContent.MapParams, org.elasticsearch.common.xcontent.ToXContent.Params
  • Field Summary

    Fields
    Modifier and Type Field Description
    static char DEFAULT_DELIMITER
    The default field delimiter expected by the native autodetect program.
    static char DEFAULT_QUOTE_CHAR
    The default quote character used to escape text in delimited data formats
    static java.lang.String DEFAULT_TIME_FIELD
    By default autodetect expects the timestamp in a field with this name
    static java.lang.String EPOCH
    Special time format string for epoch times (seconds)
    static java.lang.String EPOCH_MS
    Special time format string for epoch times (milliseconds)
    static org.elasticsearch.common.xcontent.ObjectParser<DataDescription.Builder,​java.lang.Void> PARSER  

    Fields inherited from interface org.elasticsearch.common.xcontent.ToXContent

    EMPTY_PARAMS
  • Constructor Summary

    Constructors
    Constructor Description
    DataDescription​(DataDescription.DataFormat dataFormat, java.lang.String timeFieldName, java.lang.String timeFormat, java.lang.Character fieldDelimiter, java.lang.Character quoteCharacter)  
  • Method Summary

    Modifier and Type Method Description
    boolean equals​(java.lang.Object other)
    Overridden equality test
    java.lang.Character getFieldDelimiter()
    If the data is in a delimited format with a header (e.g. csv or tsv), this is the delimiter character used.
    DataDescription.DataFormat getFormat()
    The format of the data to be processed.
    java.lang.Character getQuoteCharacter()
    The quote character used in delimited formats.
    java.lang.String getTimeField()
    The name of the field containing the timestamp
    java.lang.String getTimeFormat()
    Either "epoch", "epoch_ms" or a DateTimeFormatter pattern string.
    int hashCode()  
    org.elasticsearch.common.xcontent.XContentBuilder toXContent​(org.elasticsearch.common.xcontent.XContentBuilder builder, org.elasticsearch.common.xcontent.ToXContent.Params params)  

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.elasticsearch.common.xcontent.ToXContentObject

    isFragment
  • Field Details

    • EPOCH

      public static final java.lang.String EPOCH
      Special time format string for epoch times (seconds)
      See Also:
      Constant Field Values
    • EPOCH_MS

      public static final java.lang.String EPOCH_MS
      Special time format string for epoch times (milliseconds)
      See Also:
      Constant Field Values
    • DEFAULT_TIME_FIELD

      public static final java.lang.String DEFAULT_TIME_FIELD
      By default autodetect expects the timestamp in a field with this name
      See Also:
      Constant Field Values
    • DEFAULT_DELIMITER

      public static final char DEFAULT_DELIMITER
      The default field delimiter expected by the native autodetect program.
      See Also:
      Constant Field Values
    • DEFAULT_QUOTE_CHAR

      public static final char DEFAULT_QUOTE_CHAR
      The default quote character used to escape text in delimited data formats
      See Also:
      Constant Field Values
    • PARSER

      public static final org.elasticsearch.common.xcontent.ObjectParser<DataDescription.Builder,​java.lang.Void> PARSER
  • Constructor Details

    • DataDescription

      public DataDescription​(DataDescription.DataFormat dataFormat, java.lang.String timeFieldName, java.lang.String timeFormat, java.lang.Character fieldDelimiter, java.lang.Character quoteCharacter)
  • Method Details

    • toXContent

      public org.elasticsearch.common.xcontent.XContentBuilder toXContent​(org.elasticsearch.common.xcontent.XContentBuilder builder, org.elasticsearch.common.xcontent.ToXContent.Params params) throws java.io.IOException
      Specified by:
      toXContent in interface org.elasticsearch.common.xcontent.ToXContent
      Throws:
      java.io.IOException
    • getFormat

      public DataDescription.DataFormat getFormat()
      The format of the data to be processed. Defaults to DataDescription.DataFormat.XCONTENT
      Returns:
      The data format
    • getTimeField

      public java.lang.String getTimeField()
      The name of the field containing the timestamp
      Returns:
      A String if set or null
    • getTimeFormat

      public java.lang.String getTimeFormat()
      Either "epoch", "epoch_ms" or a DateTimeFormatter pattern string. If not set (null or an empty string), or set to "epoch_ms" (the default), the date is assumed to be in milliseconds since the epoch.
      Returns:
      A String if set or null
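
The three kinds of time format described above can be illustrated with a minimal, standard-library-only sketch. The class TimeFormatSketch and the helper parseTimestamp are hypothetical names introduced here for illustration; they are not part of this client API and do not reproduce how the ML job itself parses timestamps. The sketch also assumes that a pattern without zone information refers to UTC.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not part of the Elasticsearch client.
final class TimeFormatSketch {
    // String values as quoted in the getTimeFormat() description.
    static final String EPOCH = "epoch";
    static final String EPOCH_MS = "epoch_ms";

    /** Converts a raw timestamp string to an Instant according to timeFormat. */
    static Instant parseTimestamp(String raw, String timeFormat) {
        // Null, empty, or "epoch_ms": milliseconds since the epoch (the documented default).
        if (timeFormat == null || timeFormat.isEmpty() || EPOCH_MS.equals(timeFormat)) {
            return Instant.ofEpochMilli(Long.parseLong(raw));
        }
        // "epoch": seconds since the epoch.
        if (EPOCH.equals(timeFormat)) {
            return Instant.ofEpochSecond(Long.parseLong(raw));
        }
        // Anything else is treated as a DateTimeFormatter pattern.
        // Assumption: the pattern carries no zone, so the value is read as UTC.
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(timeFormat);
        return LocalDateTime.parse(raw, formatter).toInstant(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(parseTimestamp("1500000000", EPOCH));
        System.out.println(parseTimestamp("1500000000000", EPOCH_MS));
        System.out.println(parseTimestamp("2017-07-14 02:40:00", "yyyy-MM-dd HH:mm:ss"));
    }
}
```

All three calls in main describe the same instant, which shows how the choice of time format changes only the interpretation of the raw string.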
    • getFieldDelimiter

      public java.lang.Character getFieldDelimiter()
      If the data is in a delimited format with a header (e.g. csv or tsv), this is the delimiter character used. This is only applicable if getFormat() is DataDescription.DataFormat.DELIMITED. The default value for the delimited format is 9 (the character code of the tab character, '\t').
      Returns:
      The field delimiter character
    • getQuoteCharacter

      public java.lang.Character getQuoteCharacter()
      The quote character used in delimited formats. The default value for the delimited format is 34 (the character code of the double quote, '"').
      Returns:
      The delimited format quote character
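
The numeric defaults given for the field delimiter and quote character are character codes: 9 is the tab character and 34 is the double quote. The following sketch shows how such a delimiter and quote character might be applied to one line of delimited data. DelimitedDefaultsSketch and splitLine are hypothetical names used only for illustration, and the splitting logic is deliberately simplified (no escaped quotes); it is not the parser used by the native autodetect program.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Elasticsearch client.
final class DelimitedDefaultsSketch {
    // The documented default values, written as character codes.
    static final char DEFAULT_DELIMITER = (char) 9;   // tab, '\t'
    static final char DEFAULT_QUOTE_CHAR = (char) 34; // double quote, '"'

    /** Splits a line on the delimiter, ignoring delimiters inside quoted text. */
    static List<String> splitLine(String line, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                // Toggle quoted state; the quote character itself is dropped.
                inQuotes = !inQuotes;
            } else if (c == delimiter && !inQuotes) {
                // An unquoted delimiter ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Three tab-separated fields; the second contains a quoted tab.
        String line = "1500000000\t\"host\tA\"\t42";
        System.out.println(splitLine(line, DEFAULT_DELIMITER, DEFAULT_QUOTE_CHAR));
    }
}
```

Quoting lets a field contain the delimiter itself, which is why the quote character only matters for the delimited format.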
    • equals

      public boolean equals​(java.lang.Object other)
      Overridden equality test
      Overrides:
      equals in class java.lang.Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class java.lang.Object