Neuro-Symbolic Creative Artificial Intelligence for Humor

  • Author: Thomas Winters
  • Publication Date: 2023-12
  • Publication Venue: PhD thesis
  • Abstract: Large transformer-based language models, e.g. BERT and GPT-3, outperform previous architectures on most natural language processing tasks. Such language models are first pre-trained on gigantic corpora of text and later used as base models for finetuning on a particular task. Since the pre-training step is usually not repeated, base models are not up to date with the latest information. In this paper, we update RobBERT, a RoBERTa-based state-of-the-art Dutch language model, which was trained in 2019. First, the tokenizer of RobBERT is updated to include new high-frequency tokens present in the latest Dutch OSCAR corpus, e.g. corona-related words. Then we further pre-train the RobBERT model using this dataset. To evaluate whether our new model is a plug-in replacement for RobBERT, we introduce two additional criteria based on concept drift of existing tokens and alignment for novel tokens. We found that for certain language tasks this update results in a significant performance increase. These results highlight the benefit of continually updating a language model to account for evolving language use.

Citation

APA

Winters, T. (2023). Neuro-Symbolic Creative Artificial Intelligence for Humor [PhD thesis]. KU Leuven.

Harvard

Winters, T. (2023) Neuro-Symbolic Creative Artificial Intelligence for Humor. PhD thesis. KU Leuven.

Vancouver

1. Winters T. Neuro-Symbolic Creative Artificial Intelligence for Humor [PhD thesis]. KU Leuven; 2023.

Related projects

2014
Solo project

Babbly

A programming language for efficiently building complex text generators

2021
Contributor

DeepStochLog

Neural Stochastic Logic Programming

2020
Solo project

GITTA

Discovering textual structures to create generators

2020
Project collaborator

RobBERT

The state-of-the-art Dutch language model

2020
Project lead

RobBERT Humor Detection

Distinguishing jokes from generated non-jokes, creating the first Dutch humor detectors

2018
Project lead

Talk Generator

Automatic slideshow generation about any topic

2016
Solo project

TorfsBot

Twitterbot automatically imitating Rik Torfs, the previous rector of KU Leuven

2022
Solo project

TorfsBot Or Not?

Is the tweet from Rik Torfs or TorfsBot? A daily Twitter Turing test

2015
Solo project

Twitterbots

Various little Twitterbots I made over the years
