Neuro-Symbolic Creative Artificial Intelligence for Humor
- Author: Thomas Winters
- Publication Date: 2023-12
- Publication Venue: PhD thesis
- Abstract: Large transformer-based language models, e.g. BERT and GPT-3, outperform previous architectures on most natural language processing tasks. Such language models are first pre-trained on gigantic corpora of text and later used as a base model for fine-tuning on a particular task. Since the pre-training step is usually not repeated, base models are not up-to-date with the latest information. In this paper, we update RobBERT, a RoBERTa-based state-of-the-art Dutch language model, which was trained in 2019. First, the tokenizer of RobBERT is updated to include new high-frequency tokens present in the latest Dutch OSCAR corpus, e.g. corona-related words. Then we further pre-train the RobBERT model using this dataset. To evaluate whether our new model is a plug-in replacement for RobBERT, we introduce two additional criteria based on concept drift of existing tokens and alignment for novel tokens. We found that for certain language tasks this update results in a significant performance increase. These results highlight the benefit of continually updating a language model to account for evolving language use.
Citation
APA
Winters, T. (2023). Neuro-Symbolic Creative Artificial Intelligence for Humor [PhD thesis]. KU Leuven.
Harvard
Winters, T. (2023) Neuro-Symbolic Creative Artificial Intelligence for Humor. PhD thesis. KU Leuven.
Vancouver
1. Winters T. Neuro-Symbolic Creative Artificial Intelligence for Humor [PhD thesis]. KU Leuven; 2023.
Related projects
![](/static/c2c62ad614872634c3e5b9e86d6b2b44/3704f/cover.png)
Babbly
A programming language for efficiently building complex text generators
![](/static/07badb80c363e4e48670f6fc05e16293/aab36/cover.png)
DeepStochLog
Neural Stochastic Logic Programming
![](/static/f80afac5fd4dd4f2b67fd1ffe82d9a71/680de/cover.png)
GITTA
Discovering textual structures to create generators
![](/static/23aecdae0bb05178a34c9cc7d333ed78/ac9fb/cover.png)
RobBERT
The state-of-the-art Dutch language model
![](/static/4196985f6085f1f028c01aa0673abcd0/680de/cover.png)
RobBERT Humor Detection
Distinguishing jokes from generated non-jokes, creating the first Dutch humor detectors
![](/static/af556600246ea3742c8d162e22b9fc6f/6608c/cover.png)
Talk Generator
Automatic slideshow generation about any topic
![](/static/8be311c6f34294d1ac5f2cce51b0f8b5/0e988/cover.png)
TorfsBot
Twitterbot automatically imitating Rik Torfs, the previous rector of KU Leuven
![](/static/8ba959b042306b08b3567820b583fe4c/7d110/cover.png)
TorfsBot Or Not?
Is the tweet from Rik Torfs or TorfsBot? A daily Twitter Turing test
![](/static/0e45437b7292e910268f520b0f910a33/fe7dd/cover.png)
Twitterbots
Various little Twitterbots I made over the years