KeyBERT

An interesting technique for extracting keywords from a text. It works as follows (quoted directly from the implementation):

A minimal method for keyword extraction with BERT
The keyword extraction is done by finding the sub-phrases in
a document that are the most similar to the document itself.
First, document embeddings are extracted with BERT to get a
document-level representation. Then, word embeddings are extracted
for N-gram words/phrases. Finally, we use cosine similarity to find the
words/phrases that are the most similar to the document.
The most similar words could then be identified as the words that
best describe the entire document.
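
The steps in the quote can be sketched in a few lines of Python. This is only an illustration of the pipeline (embed the document, embed candidate n-grams, rank by cosine similarity), not KeyBERT's actual code. In particular, `embed` below is a toy bag-of-words stand-in for BERT embeddings, and the helper names `tokenize`, `embed`, `cosine`, `candidate_ngrams` and `extract_keywords` are hypothetical, chosen for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """ASSUMPTION: toy stand-in for a BERT embedding.

    Returns a sparse bag-of-words count vector instead of a dense
    contextual embedding, so the sketch runs without any model.
    """
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def candidate_ngrams(text, ngram_range=(1, 2)):
    """Collect the unique word n-grams of the document, in order of first appearance."""
    tokens = tokenize(text)
    low, high = ngram_range
    seen = {}
    for n in range(low, high + 1):
        for i in range(len(tokens) - n + 1):
            seen.setdefault(" ".join(tokens[i:i + n]), None)
    return list(seen)


def extract_keywords(doc, ngram_range=(1, 2), top_n=5):
    """Rank candidate n-grams by their similarity to the whole document."""
    doc_vec = embed(doc)
    scored = [
        (cand, cosine(embed(cand), doc_vec))
        for cand in candidate_ngrams(doc, ngram_range)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    doc = (
        "keyword extraction with bert finds keyword phrases; "
        "keyword extraction is simple"
    )
    for phrase, score in extract_keywords(doc, top_n=3):
        print(f"{phrase}: {score:.3f}")
```

With count vectors the ranking simply favours frequent terms; the point of using BERT is that the similarity becomes semantic rather than lexical. In the library itself, as far as I know, the whole pipeline is exposed through `KeyBERT().extract_keywords(doc, keyphrase_ngram_range=(1, 2))`; check the README linked below for the current signature.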

Links

https://github.com/MaartenGr/KeyBERT?tab=readme-ov-file

https://towardsdatascience.com/keyword-extraction-with-bert-724efca412ea