Decoding Model
A decoding model is one that predicts the next symbol in the sequence given the previously generated symbols.
Autoregressive language model
A model that factorizes the probability of a sequence into a product of next-token conditionals, generating one token at a time conditioned on everything generated so far.
Anisotropic model – one whose token representations cluster in a narrow cone of the embedding space, so most tokens have high cosine similarity to each other; this property is linked to repetitive, degenerate generation.
Isotropic model – one whose token representations are spread more uniformly, so distinct tokens remain discriminable by their similarity.
Contrastive search
A decoding method that picks, from the top-k candidates, the token that maximizes a weighted combination of model confidence and a degeneration penalty (the candidate's maximum similarity to the tokens already generated); see the sketch below. It borrows ideas from contrastive learning, such as:
instance discrimination,
which treats every instance as its own class and trains the network to tell different instances apart
cross-view triplet loss,
which pulls two augmented views of the same instance together while pushing views of different instances apart.
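A minimal sketch of one contrastive search step, assuming next-token logits, the context's hidden states, and the top-k candidates' hidden states are already available from the LM (obtaining them is elided; the function name is illustrative):

```python
import torch
import torch.nn.functional as F

def contrastive_search_step(logits, context_hidden, candidate_hidden, alpha=0.6, k=5):
    # Model confidence: probability of each of the top-k candidate tokens.
    probs = torch.softmax(logits, dim=-1)
    top_p, top_ids = probs.topk(k)

    # Degeneration penalty: each candidate's maximum cosine similarity to any
    # token representation already in the context. candidate_hidden is assumed
    # to be ordered to match top_ids.
    ctx = F.normalize(context_hidden, dim=-1)      # [ctx_len, dim]
    cand = F.normalize(candidate_hidden, dim=-1)   # [k, dim]
    penalty = (cand @ ctx.T).max(dim=-1).values    # [k]

    # Trade off confidence against repetition; pick the best candidate.
    score = (1 - alpha) * top_p - alpha * penalty
    return top_ids[score.argmax()]
```

With alpha = 0 this reduces to plain greedy decoding; larger alpha penalizes tokens that would echo the context.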
Maximal Marginal Relevance (MMR)
– iteratively picks the element that is most relevant to the query while being least redundant with the elements already selected, trading the two off with a weight λ.
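A minimal MMR sketch over unit-normalized embedding vectors (the function name and the λ default are illustrative):

```python
import numpy as np

def mmr_select(query, docs, n, lam=0.7):
    # query: [dim], docs: [num_docs, dim]; both assumed unit-normalized,
    # so dot products are cosine similarities.
    relevance = docs @ query
    selected, remaining = [], list(range(len(docs)))
    while remaining and len(selected) < n:
        if selected:
            # Redundancy = max similarity to anything already selected.
            redundancy = (docs[remaining] @ docs[selected].T).max(axis=1)
        else:
            redundancy = np.zeros(len(remaining))
        mmr = lam * relevance[remaining] - (1 - lam) * redundancy
        best = remaining[int(mmr.argmax())]
        selected.append(best)
        remaining.remove(best)
    return selected
```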
Determinantal Point Processes (DPP)
– assign each subset of elements a probability proportional to the determinant of the corresponding submatrix of a similarity kernel; because similar elements shrink the determinant, the highest-scoring subsets are the most diverse ones.
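A sketch of the determinant-as-diversity idea, plus a simple greedy heuristic for picking a high-determinant subset (L is assumed to be a positive semi-definite similarity kernel; exact DPP MAP inference is harder than this):

```python
import numpy as np

def dpp_subset_score(L, subset):
    # Unnormalized DPP score: determinant of the kernel submatrix.
    # Mutually similar items drive the determinant toward zero.
    idx = np.asarray(subset)
    return np.linalg.det(L[np.ix_(idx, idx)])

def greedy_dpp(L, n):
    # Greedily add the item that most increases the determinant,
    # a common approximate heuristic for diverse subset selection.
    selected, remaining = [], list(range(len(L)))
    for _ in range(n):
        gains = [dpp_subset_score(L, selected + [i]) for i in remaining]
        best = remaining[int(np.argmax(gains))]
        selected.append(best)
        remaining.remove(best)
    return selected
```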
Core-Set Selection –
aims to find a small subset of a dataset that preserves the representative information of the entire dataset, e.g., so that a model trained on the subset behaves close to one trained on all the data.
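One common instantiation is greedy k-center selection; a sketch (the seed point and Euclidean distance are arbitrary choices here):

```python
import numpy as np

def k_center_greedy(X, n):
    # Repeatedly add the point farthest from the current selection,
    # so the chosen subset covers the dataset as evenly as possible.
    selected = [0]                              # arbitrary seed point
    dists = np.linalg.norm(X - X[0], axis=1)    # distance to nearest selected point
    for _ in range(n - 1):
        far = int(dists.argmax())               # farthest uncovered point
        selected.append(far)
        dists = np.minimum(dists, np.linalg.norm(X - X[far], axis=1))
    return selected
```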
Variational Information Bottleneck (VIB) –
seeks a compact representation of the input that discards irrelevant detail while preserving the information needed for the prediction task.
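In the standard formulation (Alemi et al.'s deep VIB), a stochastic representation Z of input X is trained to stay predictive of the target Y while compressing X, with β controlling the trade-off:

```latex
\max_{\theta} \; I(Z; Y) - \beta \, I(Z; X)
```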
Greedy Algorithms –
iteratively select the element that looks best at the current step, e.g., the most contrasting or distinct element that is still relevant, without revisiting earlier choices.
Greedy algorithm
In decoding, greedily picking the single most probable token at every step; fast, but in open-ended generation it is prone to degenerative expression.
Degenerative expression
Text generated in a repetitive, nonsensical, or inconsistent manner.
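A minimal greedy decoding loop, assuming a Hugging-Face-style model whose output exposes next-token logits via a .logits attribute (that interface, and batch size 1, are assumptions):

```python
import torch

def greedy_decode(model, input_ids, max_new_tokens=50, eos_id=None):
    for _ in range(max_new_tokens):
        logits = model(input_ids).logits[:, -1, :]      # last-step logits
        next_id = logits.argmax(dim=-1, keepdim=True)   # single most likely token
        input_ids = torch.cat([input_ids, next_id], dim=-1)
        if eos_id is not None and next_id.item() == eos_id:
            break
    return input_ids
```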
Semantic consistency
It involves evaluating the generated text for grammatical correctness, coherence, and relevance to the context, and ensuring that it aligns with the meaning and intent of the input. In NLP, semantic consistency is an important aspect of model evaluation.
Beam search
Beam search is a search algorithm used in natural language processing and machine translation to generate the most likely sequence of words or tokens given a partial sequence as input. It works by keeping track of the k most likely candidate sequences (called beams) at each time step, instead of just one sequence as in greedy search. At each step the algorithm extends every beam with candidate next tokens scored by the language model, then keeps only the k highest-scoring sequences. This repeats until a stopping condition is met, such as generating a fixed number of tokens or reaching an end-of-sequence marker. Beam search trades computation for quality: it typically finds higher-probability outputs than greedy search while still exploring alternative sequences.
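A minimal sketch, assuming a step_fn(seq) that returns (token, log-probability) pairs for the next position (that interface is hypothetical):

```python
def beam_search(step_fn, start, k=3, max_len=20, eos=None):
    beams = [(0.0, [start])]                     # (cumulative log-prob, sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if eos is not None and seq[-1] == eos:
                candidates.append((score, seq))  # finished beam carries over
                continue
            for tok, logp in step_fn(seq):
                candidates.append((score + logp, seq + [tok]))
        # Keep only the k highest-scoring sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return beams[0][1]                           # best sequence found
```

With k = 1 this degenerates to greedy search; larger k widens the explored frontier at proportional cost.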
Nucleus sampling
Nucleus sampling (top-p sampling) is a technique in Natural Language Processing (NLP) for generating text, and an alternative to deterministic methods such as greedy decoding and beam search. At each step it restricts the candidates to the "nucleus": the smallest set of most likely next words whose cumulative probability exceeds a threshold p. The probabilities inside the nucleus are renormalized and the next word is sampled from them, rather than always taking the single most likely option. This introduces controlled randomness and diversity into the generated text, balancing quality against exploration of less frequent but more varied continuations, and can lead to more creative and nuanced outputs.
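A minimal top-p sampling sketch over a next-token probability vector (the function name is illustrative):

```python
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]            # token ids, most likely first
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1  # smallest prefix with mass >= p
    nucleus = order[:cutoff]
    weights = probs[nucleus] / probs[nucleus].sum()  # renormalize inside the nucleus
    return int(rng.choice(nucleus, p=weights))
```

Lower p makes the output closer to greedy decoding; higher p admits more low-probability words and thus more diversity.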