SocialCompUW/youtube-covid-misinfo-detect

Text Classification

Inference Endpoints

SocialCompUW committed on Aug 26, 2024

Commit 1ca810a · verified · 1 Parent(s): 2bcc24a

Update README.md

Files changed (1)

README.md +5 -2

@@ -37,10 +37,13 @@ The dataset was split 80-10-10 across the train (N=2180), validation (N=272), an
 To get started, you should initialize the model using AutoTokenizer and AutoModelForSequenceClassification classes. For the tokenizer, set "use_fast" parameter to False, the max_len to 1024, padding to "max_length," and truncation to True. For the model, set the "num_labels" parameter to 3.
-Next, with a YouTube video dataset with metadata, please concatenate each video's title, description, transcripts, and tags in the following manner:
 input = 'VIDEO TITLE: ' + title + '\nVIDEO DESCRIPTION: ' + description + '\nVIDEO TRANSCRIPT: ' + transcript + '\nVIDEO TAGS: ' + tags
-Thus, each video in your dataset should have its input metadata formatted in the structure above. Finally, run the input into a tokenizer and feed the tokenized input into the model to obtain one of three predicted labels. Use the logit function to obtain the label: _, pred_idx = outputs.logits.max(dim=1)
 ## Training Data

 To get started, you should initialize the model using AutoTokenizer and AutoModelForSequenceClassification classes. For the tokenizer, set "use_fast" parameter to False, the max_len to 1024, padding to "max_length," and truncation to True. For the model, set the "num_labels" parameter to 3.
+Next, with a YouTube video dataset with metadata, please concatenate each video's title, description, transcripts, and tags in the following manner:
 input = 'VIDEO TITLE: ' + title + '\nVIDEO DESCRIPTION: ' + description + '\nVIDEO TRANSCRIPT: ' + transcript + '\nVIDEO TAGS: ' + tags
+Thus, each video in your dataset should have its input metadata formatted in the structure above. Finally, run the input into a tokenizer and feed the tokenized input into the model to obtain one of three predicted labels. Use the logit function to obtain the label:
+_, pred_idx = outputs.logits.max(dim=1)
 ## Training Data
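
The usage steps described in the changed README lines can be sketched roughly as follows. This is a minimal illustration, not the authors' reference code. The helper names `build_input` and `predict_label_index` are hypothetical, the model ID is taken from the repository name on this page, and passing the README's "max_len" as the call-time `max_length` argument is an assumption. The metadata fields are assumed to already be plain strings, since the README joins them with `+`.

```python
def build_input(title: str, description: str, transcript: str, tags: str) -> str:
    """Concatenate video metadata in the layout given in the README."""
    return (
        "VIDEO TITLE: " + title
        + "\nVIDEO DESCRIPTION: " + description
        + "\nVIDEO TRANSCRIPT: " + transcript
        + "\nVIDEO TAGS: " + tags
    )


def predict_label_index(
    text: str,
    model_id: str = "SocialCompUW/youtube-covid-misinfo-detect",  # assumed from the repo name
) -> int:
    """Tokenize one formatted input and return the index of the highest logit."""
    # Imported lazily so build_input can be used without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
    model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=3)
    model.eval()

    # Assumption: the README's "max_len" corresponds to max_length at call time.
    encoded = tokenizer(
        text,
        max_length=1024,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**encoded)

    # Same expression as in the README: the argmax over the three class logits.
    _, pred_idx = outputs.logits.max(dim=1)
    return int(pred_idx.item())
```

A call such as `predict_label_index(build_input(title, description, transcript, tags))` would return an integer in the range 0 to 2. The mapping from that index to a label name is not given in the lines shown here, so it is left out.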