deepmatcher.attr_summarizers

Defines built-in attribute summarizers.

SIF

class deepmatcher.attr_summarizers.SIF(word_contextualizer=None, word_comparator=None, word_aggregator=None, hidden_size=None)

The attribute summarizer for the SIF (Smooth Inverse Frequency) model.

Parameters:
  • word_contextualizer (string or WordContextualizer or callable) – The word contextualizer module (refer to WordContextualizer for details) to use for attribute summarization. The SIF model does not take word context information into account, hence this defaults to None.
  • word_comparator (string or WordComparator or callable) – The word comparator module (refer to WordComparator for details) to use for attribute summarization. The SIF model does not perform word by word comparisons, hence this defaults to None.
  • word_aggregator (string or WordAggregator or callable) – The word aggregator module (refer to WordAggregator for details) to use for attribute summarization. This model uses SIF-based weighted average aggregation over the word embeddings of an input sequence, hence ‘sif-pool’ is used when this is left as None.
  • hidden_size (int) – The hidden size to use for all 3 attribute summarization sub-modules (i.e., word contextualizer, word comparator, and word aggregator), if they are customized. By default, the SIF model does not use this parameter.
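
The SIF-based weighted average mentioned for word_aggregator can be sketched in plain Python. This is only an illustration of the weighting idea from the SIF literature, where each word vector is scaled by a / (a + p(w)) before averaging. It is not deepmatcher's ‘sif-pool’ implementation, it omits the principal-component removal step of the full SIF method, and the names sif_weighted_average, word_probs, and a are assumptions made for this example.

```python
def sif_weighted_average(embeddings, word_probs, a=1e-3):
    """Average word vectors using SIF weights a / (a + p(w)).

    embeddings: list of equal-length lists of floats, one per word.
    word_probs: estimated unigram probability p(w) for each word.
    a: smoothing constant (hypothetical default chosen for this sketch).
    """
    if not embeddings:
        raise ValueError("need at least one word vector")
    dim = len(embeddings[0])
    total = [0.0] * dim
    for vec, p in zip(embeddings, word_probs):
        weight = a / (a + p)  # rare words (small p) get weights close to 1
        for i, x in enumerate(vec):
            total[i] += weight * x
    # Divide by the number of words to obtain the weighted average.
    return [x / len(embeddings) for x in total]


summary = sif_weighted_average([[1.0, 0.0], [0.0, 1.0]], [0.0, 1e-3])
```

Frequent words receive smaller weights, so they contribute less to the attribute summary than rare, more informative words.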

RNN

class deepmatcher.attr_summarizers.RNN(*args, **kwargs)

The attribute summarizer for the RNN model.

Parameters:
  • word_contextualizer (string or WordContextualizer or callable) – The word contextualizer module (refer to WordContextualizer for details) to use for attribute summarization. This model uses an RNN to take context information into account. The default value is ‘gru’ (i.e., a bidirectional GRU is used as the specific RNN instantiation). Other options are ‘rnn’ (the vanilla bi-RNN) and ‘lstm’ (the bi-LSTM model).
  • word_comparator (string or WordComparator or callable) – The word comparator module (refer to WordComparator for details) to use for attribute summarization. The RNN model does not perform word by word comparisons, hence this defaults to None.
  • word_aggregator (string or WordAggregator or callable) – The word aggregator module (refer to WordAggregator for details) to use for attribute summarization. The RNN model uses a bidirectional RNN and concatenates the last outputs of the forward and backward RNNs, hence the default value is ‘birnn-last-pool’.
  • hidden_size (int) – The hidden size to use for the word contextualizer. This value will also be used as the hidden size for the other 2 attribute summarization sub-modules (i.e., word comparator and word aggregator), if they are customized. If not specified, the hidden size for each component is set to its input size. For example, if the word embedding dimension is 300 and hidden_size is None, the word contextualizer’s hidden size will be 300.
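
Two of the rules above lend themselves to a short sketch: the hidden_size fallback to the input size, and the concatenation behind ‘birnn-last-pool’. The helpers resolve_hidden_size and birnn_last_pool are hypothetical names used only for illustration; they are not deepmatcher functions, and the sketch assumes each direction's outputs are given in the order that direction processed the sequence.

```python
def resolve_hidden_size(input_size, hidden_size=None):
    """Fall back to the input size when no hidden size is specified."""
    return input_size if hidden_size is None else hidden_size


def birnn_last_pool(forward_outputs, backward_outputs):
    """Concatenate the final output of the forward and backward RNNs.

    Both arguments are lists of per-step output vectors in processing
    order, so the last element of backward_outputs is the state reached
    after the backward RNN has read the first token of the sequence.
    """
    return list(forward_outputs[-1]) + list(backward_outputs[-1])


# A 300-dimensional embedding with hidden_size=None gives a hidden size of 300.
size = resolve_hidden_size(300)
pooled = birnn_last_pool([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Under these assumptions the pooled vector is twice as wide as a single direction's output, since the two final outputs are concatenated rather than summed.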

Attention

class deepmatcher.attr_summarizers.Attention(*args, **kwargs)

The attribute summarizer for the attention-based model.

Parameters:
  • word_contextualizer (string or WordContextualizer or callable) – The word contextualizer module (refer to WordContextualizer for details) to use for attribute summarization. The attention model does not take word context information into account, hence this defaults to None.
  • word_comparator (string or WordComparator or callable) – The word comparator module (refer to WordComparator for details) to use for attribute summarization. The attention model performs word by word comparison with the decomposable attention mechanism, hence this defaults to ‘decomposable-attention’.
  • word_aggregator (string or WordAggregator or callable) – The word aggregator module (refer to WordAggregator for details) to use for attribute summarization. The attention model aggregates by summing the comparison results from the word comparator and dividing by the square root of the input sequence length (to keep the variance constant through the network). Hence this defaults to ‘divsqrt-pool’.
  • hidden_size (int) – The hidden size to use for the word comparator. This value will also be used as the hidden size for the other 2 attribute summarization sub-modules (i.e., word contextualizer and word aggregator), if they are customized. If not specified, the hidden size for each component is set to its input size. For example, if the word embedding dimension is 300 and hidden_size is None, the word comparator’s hidden size will be 300.
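
The name ‘divsqrt-pool’ suggests that the summed comparison vectors are scaled by the square root of the sequence length. The sketch below assumes that reading and shows the arithmetic in plain Python; divsqrt_pool is a hypothetical helper written for this example, not the deepmatcher module itself.

```python
import math


def divsqrt_pool(vectors):
    """Sum the vectors element-wise and divide by sqrt(sequence length)."""
    length = len(vectors)
    if length == 0:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    summed = [sum(vec[i] for vec in vectors) for i in range(dim)]
    scale = math.sqrt(length)
    return [x / scale for x in summed]


# Four identical comparison vectors: the sum is [4.0, 8.0] and sqrt(4) is 2.0.
pooled = divsqrt_pool([[1.0, 2.0]] * 4)
```

The scaling choice follows from a standard variance argument: a sum of n independent terms with unit variance has variance n, so dividing by sqrt(n) brings the variance back to 1, whereas dividing by n would shrink it to 1/n.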

Hybrid

class deepmatcher.attr_summarizers.Hybrid(*args, **kwargs)

The attribute summarizer for the hybrid model.

Parameters:
  • word_contextualizer (string or WordContextualizer or callable) – The word contextualizer module (refer to WordContextualizer for details) to use for attribute summarization. The hybrid model uses a bidirectional GRU (a specific type of RNN) to take context information into account. The default value is ‘gru’.
  • word_comparator (string or WordComparator or callable) – The word comparator module (refer to WordComparator for details) to use for attribute summarization. The hybrid model performs word by word comparison over the raw input word embeddings (rather than the RNN hidden states), hence this defaults to an Attention object with ‘decomposable’ as the attention mechanism on the raw input embeddings.
  • word_aggregator (string or WordAggregator or callable) – The word aggregator module (refer to WordAggregator for details) to use for attribute summarization. The hybrid model uses a second layer of attention for the aggregation; consult the paper for details. The default value is ‘concat-attention-with-rnn’.
  • hidden_size (int) – The hidden size to use for all 3 attribute summarization sub-modules (i.e., word contextualizer, word comparator, and word aggregator), if they are customized.
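
For quick comparison, the defaults documented on this page can be collected into one lookup table. This is a plain-Python summary of the text above, not something deepmatcher provides; DOCUMENTED_DEFAULTS and documented_default are hypothetical names, and the Hybrid word comparator is recorded as a descriptive string because its default is an Attention object rather than a string option.

```python
# Defaults as stated in the parameter descriptions on this page.
DOCUMENTED_DEFAULTS = {
    "SIF": {
        "word_contextualizer": None,
        "word_comparator": None,
        "word_aggregator": "sif-pool",
    },
    "RNN": {
        "word_contextualizer": "gru",
        "word_comparator": None,
        "word_aggregator": "birnn-last-pool",
    },
    "Attention": {
        "word_contextualizer": None,
        "word_comparator": "decomposable-attention",
        "word_aggregator": "divsqrt-pool",
    },
    "Hybrid": {
        "word_contextualizer": "gru",
        # Descriptive placeholder: the documented default is an object.
        "word_comparator": "Attention object ('decomposable', raw input embeddings)",
        "word_aggregator": "concat-attention-with-rnn",
    },
}


def documented_default(summarizer, component):
    """Look up the documented default for one sub-module of a summarizer."""
    return DOCUMENTED_DEFAULTS[summarizer][component]


aggregator = documented_default("Hybrid", "word_aggregator")
```

Read across the table, only the RNN and Hybrid summarizers use a word contextualizer by default, and only the Attention and Hybrid summarizers perform word by word comparison.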