
Questions tagged [natural-language]

Natural Language Processing is a set of techniques from linguistics, artificial intelligence, machine learning, and statistics that aim to process and understand human language.

1 vote
0 answers
12 views
+100 bounty

NER with custom tags: how to approach it

I am building a "field tagger" for documents. Basically, a document, in my case something like a proposal or sales quote, would have a bunch of entities scattered throughout it, and we want ...
redbull_nowings
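One common way to approach a "field tagger" like this is to train a standard NER component with custom entity labels. Below is a minimal sketch using spaCy v3; the label names, example text, and character offsets are hypothetical placeholders, not taken from the question, and a real setup would need many annotated documents rather than one toy loop.

```python
# Sketch: custom-label NER with spaCy v3 (labels and data are made up).
import spacy
from spacy.training import Example

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
for label in ("CUSTOMER", "QUOTE_TOTAL"):        # assumed custom field tags
    ner.add_label(label)

text = "Quote for Acme Corp: total $12,400, delivery by 2024-06-01."
annotations = {"entities": [(10, 19, "CUSTOMER"), (27, 34, "QUOTE_TOTAL")]}

nlp.initialize()
example = Example.from_dict(nlp.make_doc(text), annotations)
for _ in range(20):                              # toy training loop
    nlp.update([example])

doc = nlp(text)
print([(ent.text, ent.label_) for ent in doc.ents])
```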
0 votes
0 answers
28 views

Normalizing the embedding space of an encoder language model with respect to categorical data

Suppose we have a tree/hierarchy of categories (e.g. categories of products in an e-commerce website), each node being assigned a title. Assume that the title of each node is semantically accurate, ...
mtcicero • 123
0 votes
0 answers
9 views

Why learn an embedding before self-attention when training transformers?

I understand that self-attention layers learn the "role" of a word in a sentence while embedding layers learn the relationship between the words. But I am not totally convinced that a self-...
Nicolas Johnson
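Part of the answer is mechanical: self-attention operates on dense vectors, so discrete token ids have to be mapped into a continuous space first, and that lookup table is itself learned. A small PyTorch sketch with arbitrary toy sizes:

```python
# Sketch: token ids -> learned embeddings -> self-attention (toy dimensions).
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 1000, 64, 10
token_ids = torch.randint(0, vocab_size, (1, seq_len))          # (batch, seq)

embed = nn.Embedding(vocab_size, d_model)                        # discrete id -> dense vector
attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

x = embed(token_ids)                                             # (1, seq, d_model)
out, weights = attn(x, x, x)                                     # self-attention over embedded tokens
print(out.shape, weights.shape)                                  # (1, 10, 64), (1, 10, 10)
```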
0 votes
0 answers
12 views

Log-likelihood calculation for unigrams

I am calculating the log-likelihood for each unigram that I generated using the CountVectorizer to see each unigram's importance. However, I got all positive values after calculating the log-...
Nick • 1
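If "log-likelihood" here means the Dunning log-likelihood ratio (G²) commonly used for keyword/unigram importance, all-positive values are expected: G² is non-negative by construction, and the direction of the effect comes from whether a word is over- or under-represented. A rough sketch under that assumption, with made-up corpora:

```python
# Sketch: Dunning log-likelihood ratio (G^2) per unigram, target vs. reference corpus.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

target = ["the model overfits the training data", "the model memorises noise"]
reference = ["the weather is nice today", "the model generalises well"]

vec = CountVectorizer()
counts = vec.fit_transform(target + reference).toarray()
a = counts[: len(target)].sum(axis=0)        # unigram counts in target
b = counts[len(target):].sum(axis=0)         # unigram counts in reference
A, B = a.sum(), b.sum()                      # corpus sizes

# expected counts under the null hypothesis of equal relative frequency
e1 = A * (a + b) / (A + B)
e2 = B * (a + b) / (A + B)

with np.errstate(divide="ignore", invalid="ignore"):
    g2 = 2 * (np.where(a > 0, a * np.log(a / e1), 0.0)
              + np.where(b > 0, b * np.log(b / e2), 0.0))

for word, score in sorted(zip(vec.get_feature_names_out(), g2), key=lambda t: -t[1])[:5]:
    print(f"{word:15s} {score:.3f}")
```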
4 votes
2 answers
534 views

Overfitting in a randomForest model in R: why?

I am trying to train a Random Forest model in R for sentiment analysis. The model works with a tf-idf matrix and learns from it how to classify a review as positive or negative. Positive ones are ...
Anisa • 43
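Without the R code it is hard to say, but the usual first checks are whether the tf-idf vectoriser was fit on the full corpus (leaking test-set vocabulary and IDF weights) and how far train and test accuracy diverge. A Python equivalent of that diagnostic, on a stand-in corpus (fetch_20newsgroups downloads data and is only a placeholder for the review set):

```python
# Sketch: fit tf-idf only on the training split, then compare train/test accuracy.
# A large gap between the two scores indicates overfitting.
from sklearn.datasets import fetch_20newsgroups
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

data = fetch_20newsgroups(subset="train", categories=["rec.autos", "sci.med"])
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0)

model = make_pipeline(
    TfidfVectorizer(min_df=5, max_features=5000),    # cap vocabulary to limit variance
    RandomForestClassifier(n_estimators=300, random_state=0))
model.fit(X_train, y_train)

print("train accuracy:", model.score(X_train, y_train))
print("test accuracy :", model.score(X_test, y_test))
```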
0 votes
0 answers
20 views

Where does the equation $ C = 6 \times N \times T $ come from for Large Language Models, ideally with a simple explanation of both passes?

Why $ C = 6 \times N \times T $? I'm trying to understand the computational steps, specifically during the backward pass of neural networks, in relation to the widely cited formula $ C = 6 \times N \times T $ ...
Charlie Parker
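For reference, the usual back-of-the-envelope accounting (popularised by the scaling-laws literature) is roughly 2 FLOPs per parameter per token for the forward pass (one multiply and one add in each matrix-vector product) and roughly twice that for the backward pass, since gradients are needed with respect to both activations and weights; this is an approximation that ignores attention- and embedding-specific costs:

$$ C \;\approx\; \underbrace{2NT}_{\text{forward}} \;+\; \underbrace{4NT}_{\text{backward}} \;=\; 6NT, \qquad N = \text{model parameters},\quad T = \text{training tokens}. $$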
0 votes
0 answers
18 views

Can 3D convolutions appropriately capture a frozen embedding space?

My project is a strange combination of NLP and Computer Vision. My data points are 3D tensors where each element is a token from an NLP vocabulary. The vocabulary is around 1000 unique "words"...
schmixi • 43
0 votes
1 answer
26 views

Finding the event date given per-note probabilities of the event

I have a set of clinical notes with dates for each patient and an NLP model which gives a score between 0.0 and 1.0 for a certain event being present in the note. Given the scores, what is the best ...
rhn89 • 101
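Two simple baselines, assuming the scores can be treated as roughly calibrated probabilities: take the date of the highest-scoring note, or compute a score-weighted average date. The data below are hypothetical placeholders:

```python
# Sketch: estimate an event date from per-note scores (toy data).
from datetime import date

notes = [  # (note date, model score)
    (date(2021, 1, 5), 0.10),
    (date(2021, 2, 9), 0.85),
    (date(2021, 3, 2), 0.60),
]

# Option 1: date of the highest-scoring note
best_date = max(notes, key=lambda ns: ns[1])[0]

# Option 2: score-weighted average date (normalised scores as weights)
total = sum(s for _, s in notes)
weighted_ordinal = sum(d.toordinal() * s for d, s in notes) / total
mean_date = date.fromordinal(round(weighted_ordinal))

print(best_date, mean_date)
```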
0 votes
0 answers
10 views

Appropriateness of the Universal Sentence Encoder model

I have a classification problem where the goal is to predict, based on a small paragraph, if an individual is British or not. The model used for the classification is the Universal Sentence Encoder (to ...
Sara Mun
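For reference, the standard way to use the Universal Sentence Encoder is to load it from TensorFlow Hub, embed each paragraph into a 512-dimensional vector, and feed those vectors to any downstream classifier; whether nationality can or should be predicted from a paragraph is a separate (and ethically sensitive) question. The classifier choice and the toy texts/labels below are assumptions:

```python
# Sketch: USE embeddings + a simple classifier (toy data, model downloads from TF Hub).
import tensorflow_hub as hub
from sklearn.linear_model import LogisticRegression

use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

texts = ["I queued for the bus in the drizzle.", "I grabbed a coffee downtown."]
labels = [1, 0]                       # 1 = British, 0 = not (toy labels)

embeddings = use(texts).numpy()       # (n_texts, 512) sentence vectors
clf = LogisticRegression().fit(embeddings, labels)
print(clf.predict(use(["We had tea in the garden."]).numpy()))
```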
0 votes
1 answer
33 views

Clustering of large text datasets with unknown number of clusters

I have a list of hotel names which may or may not be correct, and with different spellings (such as '&' instead of 'and'). I want to use clustering in order to group the hotels with different ...
user480840
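Since the number of clusters is unknown and the variation is mostly spelling, one common approach is character n-gram TF-IDF plus a density- or threshold-based clusterer that does not require k. A sketch with made-up names; the `eps` threshold is a guess you would tune:

```python
# Sketch: group near-duplicate hotel names without fixing the number of clusters.
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer

names = [
    "Grand Hotel & Spa", "Grand Hotel and Spa", "grand hotel spa",
    "Seaside Inn", "Sea Side Inn", "Harbour View Hotel",
]

vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4), lowercase=True)
X = vec.fit_transform(names).toarray()

clusters = DBSCAN(eps=0.45, min_samples=1, metric="cosine").fit_predict(X)
for name, label in sorted(zip(names, clusters), key=lambda nl: nl[1]):
    print(label, name)
```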
1 vote
0 answers
18 views

BERT eval loss increases while performance metrics also increase

I want to fine-tune BERT for Named Entity Recognition (NER). However, when fine-tuning over several epochs on different datasets I get a weird behaviour where the training loss decreases, eval loss ...
CodingSquirrel
0 votes
0 answers
100 views

Locality sensitive hashing (LSH) with word embeddings and cosine similarity

I would like to ask about the methodology of the LSH algorithm with word embeddings and cosine similarity to identify similar documents. First, I tokenize my sentences to create a list of tokens. Then, I ...
BDEngineer
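The LSH family that pairs with cosine similarity is random-hyperplane hashing: each document vector (e.g., an average of its word embeddings) is mapped to a bit signature by the signs of its projections onto random hyperplanes, and documents sharing a bucket become candidates for an exact similarity check. A minimal sketch with toy vectors standing in for averaged embeddings:

```python
# Sketch: random-hyperplane LSH for cosine similarity on document vectors.
from collections import defaultdict
import numpy as np

rng = np.random.default_rng(0)
dim, n_bits = 50, 16
doc_vectors = rng.normal(size=(8, dim))                          # pretend averaged embeddings
doc_vectors[1] = doc_vectors[0] + 0.01 * rng.normal(size=dim)    # near-duplicate of doc 0

hyperplanes = rng.normal(size=(n_bits, dim))                     # one hyperplane per hash bit

def signature(v: np.ndarray) -> tuple:
    """Bit signature: sign of the projection onto each random hyperplane."""
    return tuple((hyperplanes @ v > 0).astype(int))

buckets = defaultdict(list)
for i, v in enumerate(doc_vectors):
    buckets[signature(v)].append(i)

# Documents sharing a bucket are candidates for a full cosine-similarity check.
print([ids for ids in buckets.values() if len(ids) > 1])
```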
0 votes
0 answers
9 views

Problems in understanding Word2vec architectures

I have probably a very simple question, but I did not find any clear resource on the web. First let's consider the Skip-gram model, in which we try to predict a context word given the target word. In ...
user405969
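For skip-gram specifically, it sometimes helps to see that there are two separate matrices, an input (target) embedding and an output (context) embedding, and that the prediction is a softmax over the dot products between the target vector and every output vector. A bare numpy sketch with toy sizes:

```python
# Sketch: one skip-gram forward pass with two embedding matrices (toy sizes).
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 10, 4
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))     # target-word embeddings
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))    # context-word embeddings

target_id = 3
v_target = W_in[target_id]                    # the "embedding layer" lookup
scores = W_out @ v_target                     # dot product with every context vector
probs = np.exp(scores) / np.exp(scores).sum() # softmax over the vocabulary

print(probs.round(3), probs.sum())            # P(context word | target word)
```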
2 votes
1 answer
141 views

If a document set is too small for running a topic model, can you simply multiply the document set by a factor of 10 to be able to run the model?

Say I'm using Top2Vec as a topic model to capture the top 10 salient topics across documents. I have an array that contains the documents of the corpus. Initially, there are not enough documents to ...
NominalSystems
0 votes
0 answers
73 views

How does unigram tokenization use the EM algorithm?

I intuitively understand what is happening in the unigram tokenizer, and I think I would also understand the EM algorithm if I could figure out the formulation in the terms I understand, i.e. what is the latent ...
figs_and_nuts
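For what it's worth, in the unigram language-model tokenizer (as used by SentencePiece) the latent variable is the segmentation: the observed data is the raw text, the parameters are the subword unigram probabilities $p(x)$, and EM alternates between the two steps below. This is a sketch of the standard formulation, not of any particular implementation.

E-step: under the current $p(x)$, each segmentation $s = (x_1, \dots, x_k)$ of a sentence $w$ has posterior $P(s \mid w) \propto \prod_i p(x_i)$; compute the expected count $\mathbb{E}[c(x)]$ of every piece $x$ under these posteriors (done efficiently with a forward-backward pass over the segmentation lattice).

M-step: re-normalise the probabilities, $p(x) \leftarrow \mathbb{E}[c(x)] \big/ \sum_{x'} \mathbb{E}[c(x')]$.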
