Class CustomWordEmbedding

java.lang.Object
org.elasticsearch.client.ml.inference.preprocessing.CustomWordEmbedding
All Implemented Interfaces:
NamedXContentObject, PreProcessor, ToXContent, ToXContentObject

public class CustomWordEmbedding
extends java.lang.Object
implements PreProcessor
A pre-processor that embeds text into a numerical vector. It computes a set of features from the script type, n-gram hashes, and the most common script values. The features are then concatenated, using specific quantization scales and weights, into a vector of length 80. This is a fork and port of: https://github.com/google/cld3/blob/06f695f1c8ee530104416aab5dcf2d6a1414a56a/src/embedding_network.cc
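
The general idea of hashing n-grams into buckets, looking up quantized embedding rows, and concatenating the per-feature results can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of this class or of CLD3: the class name HashedNgramEmbeddingSketch, its methods, the use of String.hashCode() as the hash function, the signed-byte quantization scheme, and all table sizes are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the Elasticsearch or CLD3 implementation.
class HashedNgramEmbeddingSketch {

    /**
     * Slides a window of n code points over the text, hashes each n-gram into
     * one of numBuckets buckets, and returns bucket -> relative frequency.
     * String.hashCode() is a stand-in; the real hash function is not shown here.
     */
    static Map<Integer, Float> hashedNgramWeights(String text, int n, int numBuckets) {
        int[] codePoints = text.codePoints().toArray();
        Map<Integer, Integer> counts = new HashMap<>();
        int total = 0;
        for (int i = 0; i + n <= codePoints.length; i++) {
            String ngram = new String(codePoints, i, n);
            int bucket = Math.floorMod(ngram.hashCode(), numBuckets);
            counts.merge(bucket, 1, Integer::sum);
            total++;
        }
        Map<Integer, Float> weights = new HashMap<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            weights.put(e.getKey(), e.getValue() / (float) total);
        }
        return weights;
    }

    /**
     * Computes the weighted sum of embedding rows. Each row is stored as
     * quantized bytes and is dequantized by multiplying with its per-row scale
     * (an assumed quantization scheme, chosen for simplicity).
     */
    static float[] embed(Map<Integer, Float> weights, byte[][] quantizedRows,
                         float[] rowScales, int dim) {
        float[] out = new float[dim];
        for (Map.Entry<Integer, Float> e : weights.entrySet()) {
            int row = e.getKey();
            float weight = e.getValue();
            for (int d = 0; d < dim; d++) {
                out[d] += weight * rowScales[row] * quantizedRows[row][d];
            }
        }
        return out;
    }

    /** Concatenates per-feature embeddings into one flat vector. */
    static float[] concat(float[]... parts) {
        int length = 0;
        for (float[] part : parts) {
            length += part.length;
        }
        float[] result = new float[length];
        int offset = 0;
        for (float[] part : parts) {
            System.arraycopy(part, 0, result, offset, part.length);
            offset += part.length;
        }
        return result;
    }

    public static void main(String[] args) {
        int buckets = 100; // illustrative table size
        int dim = 16;      // illustrative embedding width
        byte[][] table = new byte[buckets][dim]; // all-zero placeholder weights
        float[] scales = new float[buckets];
        float[] unigrams = embed(hashedNgramWeights("hello", 1, buckets), table, scales, dim);
        float[] bigrams = embed(hashedNgramWeights("hello", 2, buckets), table, scales, dim);
        System.out.println(concat(unigrams, bigrams).length);
    }
}
```

With one such embedding per feature type, the length of the final vector is the sum of the per-feature embedding widths. The widths that add up to the length of 80 stated above are not given in this description, so the sizes in the sketch are placeholders.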