MopjesBot (Dutch for "little jokes bot") is a Twitter bot that automatically creates puns about the news. The bot creates variations on a popular Dutch joke format known as "Kermit the Sticker":
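Het is groen en het plakt? Kermit de Sticker! ("It's green and it sticks? Kermit the Sticker!", a play on Kermit de Kikker, the Dutch name of Kermit the Frog.)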
Every day, it picks a prominent person from the news and creates a set of five puns about this person. For example, it could tweet about Maggie De Block:
Het is een Belgisch politica en bepaalt een dierenverblijf?
Maggie De Hok!
Het is een Belgisch politica en komt tot net boven de enkel?
Maggie De Sok!
Het is een Belgisch politica en wordt om de taille gedragen?
Maggie De Rok!
Het is een Belgisch politica en is een zweertje?
Maggie De Pok!
Het is een Belgisch politica en vertelt een relatief onschuldige leugen?
Maggie De Jok!
Het is een Belgisch politica en bereidt voedsel tot een maaltijd?
Maggie De Kok!
Or about Koen Geens:
Het is een Belgisch politicus en is edelsteen?
Koen Steens!
Het is een Belgisch politicus en is fopspeen?
Koen Speens!
Het is een Belgisch politicus en is enig?
Koen Eens!
Het is een Belgisch politicus en heeft de bedoeling?
Koen Meens!
Het is een Belgisch politicus en is wilg?
Koen Weens!
(Tweeted by MopjesBot (@MopjesBot) on February 21, 2020.)
Algorithm
MopjesBot uses Wiktionary and a thesaurus to find ways to describe words, Wikipedia to find out who famous people are, and a rhyming dictionary to find suitable rhymes for the puns. It also scrapes names from the news to pick the butt of each joke.
For example, given the name Kanye West as input, it performs the following steps:
Find a rhyme for the last word with the same number of syllables, e.g. rest. If the last word of the input has multiple syllables, look for rhymes on any combination of consecutive syllables. If a word frequency list is available for the language in question, use it to prefer more common words.
Replace the relevant syllables of the input name with the rhyme word, e.g. Kanye Rest.
Use Wikipedia to find a short description of the entity with the input name. This is fairly easy to extract from the Wikipedia page, since the introduction usually starts with [entity_name] is/was/are [explanation]. By taking the part after the "to be" verb, up to the next punctuation mark or the start of a new clause, the program can distill a brief description. For example, it would describe Kanye West as an American rapper.
It now only has to describe the rhyme word to complete the pun riddle. To achieve this, either a thesaurus (for short descriptions) or segments from Wiktionary (for longer, more interesting descriptions) could be used. The algorithm should, however, make sure that the description does not contain the word to guess itself since that would spoil the fun. The word frequency table could also be used to choose less common (and thus more specific) descriptive words. For example, it could describe rest as "relief from work".
Now it can fill all these words into the template, to create the following joke:
It's an American rapper and is relief from work?
Kanye Rest
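The steps above can be sketched in a few lines of Python. This is a minimal illustration, not MopjesBot's actual code: the dictionaries `RHYMES`, `FREQUENCY`, `DEFINITIONS` and `INTRO` are small hypothetical stand-ins for the rhyming dictionary, the word frequency list, Wiktionary and the Wikipedia introduction, and all function names are made up for this example. It only handles rhymes that replace the whole last word, not partial syllable matches.

```python
import re

# Toy stand-ins for the real resources (hypothetical data, for illustration only).
RHYMES = {"west": ["rest", "nest", "zest"]}          # rhyming dictionary
FREQUENCY = {"rest": 900, "nest": 300, "zest": 40}   # word frequency list
DEFINITIONS = {                                      # Wiktionary / thesaurus
    "rest": ["rest from activity", "relief from work"],
    "nest": ["a structure built by birds"],
}
INTRO = {"Kanye West": "Kanye West is an American rapper, singer and producer."}


def describe_entity(intro_sentence):
    """Take the text after the 'to be' verb, up to the first punctuation mark."""
    match = re.search(r"\b(?:is|was|are|were)\s+([^,.;(]+)", intro_sentence)
    return match.group(1).strip() if match else None


def pick_rhyme(word):
    """Return the most frequent known rhyme for `word`, or None."""
    candidates = RHYMES.get(word.lower(), [])
    return max(candidates, key=lambda w: FREQUENCY.get(w, 0), default=None)


def describe_word(word):
    """Return the first definition that does not give away the word itself."""
    for definition in DEFINITIONS.get(word, []):
        if word not in definition.lower().split():
            return definition
    return None


def make_joke(name):
    """Fill the riddle template for `name`, or return None if a part is missing."""
    *first, last = name.split()
    rhyme = pick_rhyme(last)
    entity = describe_entity(INTRO.get(name, ""))
    clue = describe_word(rhyme) if rhyme else None
    if not (rhyme and entity and clue):
        return None
    punchline = " ".join(first + [rhyme.capitalize()])
    return f"It's {entity} and is {clue}?\n{punchline}"


print(make_joke("Kanye West"))
```

With this toy data, the definition "rest from activity" is skipped because it contains the rhyme word, so the sketch reproduces the riddle shown above.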
These jokes tend to become more interesting once you have several of them, as they turn into a fun guessing game. Luckily, because the generation process is completely automated, the described program can easily generate many related jokes.
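As a rough illustration of generating a whole set of related riddles, the sketch below loops over several rhyme words for one name and fills the Dutch template for each. The function name `make_jokes` and the `limit` parameter are assumptions made for this example; the clues are copied from the Maggie De Block tweets above rather than looked up automatically.

```python
def make_jokes(name, entity, clues, limit=5):
    """Build up to `limit` riddles by swapping the last word of `name`."""
    *first, _ = name.split()
    prefix = " ".join(first)
    jokes = []
    for rhyme, clue in list(clues.items())[:limit]:
        # Replace the last word of the name with the capitalized rhyme word.
        punchline = f"{prefix} {rhyme.capitalize()}".strip()
        jokes.append(f"Het is een {entity} en {clue}?\n{punchline}!")
    return jokes


# Rhyme word -> clue, taken from the example tweets.
clues = {
    "hok": "bepaalt een dierenverblijf",
    "sok": "komt tot net boven de enkel",
    "rok": "wordt om de taille gedragen",
}

for joke in make_jokes("Maggie De Block", "Belgisch politica", clues):
    print(joke)
```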
The complete overview of all steps MopjesBot follows to generate these jokes is summarized in the diagram below:
This system is thus able to generate jokes following a specific template and schema, but also nudges the jokes to have a higher probability of having certain characteristics (e.g., common or less common words in certain template slots).
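One simple way to nudge a template slot towards more common or less common words is frequency-weighted sampling. The sketch below is an assumption about how such a nudge could be implemented, not a description of MopjesBot's code; `pick_weighted`, `prefer_common` and the frequency counts are hypothetical.

```python
import random

# Hypothetical frequency counts for a few candidate words.
FREQUENCY = {"rest": 900, "nest": 300, "zest": 40}


def pick_weighted(words, frequency, prefer_common=True, rng=random):
    """Sample one word, weighted by frequency (or by inverse frequency)."""
    # Unknown words get a count of 1 so that every candidate stays possible.
    counts = [max(frequency.get(w, 1), 1) for w in words]
    weights = counts if prefer_common else [1 / c for c in counts]
    return rng.choices(words, weights=weights, k=1)[0]


words = ["rest", "nest", "zest"]
print(pick_weighted(words, FREQUENCY, prefer_common=True))   # usually a common word
print(pick_weighted(words, FREQUENCY, prefer_common=False))  # usually a rare word
```

Because the choice is sampled rather than taken as a strict maximum, rarer candidates still appear from time to time, which keeps the output varied while shifting the probabilities in the desired direction.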
You can read more about the inner workings of this algorithm on our research group blog.