The writing of Hunter S. Thompson and the visual art of Ralph Steadman were a great match, no doubt about it. Unfortunately, not every writer has a talented friend to illustrate their books. And even if such a friend does exist, he may not be willing to illustrate a weekday night poetry reading.
Could an AI be willing to do this job? Even with awful poetry? And could it let you ask for a specific style, such as Ralph Steadman, Picasso or Quinquela Martin?
Hardware
For this project a Raspberry Pi 3 is used, since it can record voice with a USB mic, run the OpenAI Python API and output video over HDMI.
A DFRobot 16-position encoder is used to select the pictorial style, and a 1000-lumen LED projector is in charge of projecting the illustrations/paintings. Add some jumper cables, a small 3D-printed part for the RPi enclosure and a 5 V 3 A power supply, and we are all set.
Listening
After installing Raspberry Pi OS Desktop and connecting the USB mic, the following command is used to learn the device ID (the card number) required for the recording.
$ arecord -l
To make sure the ID is correct, record a three-second test with the following command, replacing X with the value obtained above:
$ arecord -D plughw:X,0 --duration=3 --rate 44100 test.wav
The same command is run from Python with an os.system call, and speech recognition is then applied to the wav file. There are other speech recognition options as well, but this one works well, it is multilingual, and the .wav files are kept for logging purposes.
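import speech_recognition as sr

# define the recognizer
r = sr.Recognizer()
# define the audio file (filename is the unique name generated for this recording)
audio_file = sr.AudioFile('pwdata/' + filename + '.wav')
if debugMode == 1:
    print("File recorded...")
# speech recognition
with audio_file as source:
    # calibrate the noise threshold; this consumes the first second of the file
    r.adjust_for_ambient_noise(source)
    audio = r.record(source)
try:
    result = r.recognize_google(audio, language="en")
except (sr.UnknownValueError, sr.RequestError):
    # nothing intelligible was recorded, or the recognition service could not be reached
    result = ""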
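The recording step that produces the wav file for the recognizer could be sketched as follows. This is a minimal sketch, not the project's actual code: it swaps os.system for subprocess.run with an argument list (so no shell quoting is needed), and the helper names unique_name, build_arecord_command and record are hypothetical, as is the timestamp naming scheme. The pwdata folder and the arecord options are the ones shown above.

```python
import subprocess
from datetime import datetime


def unique_name(now=None):
    """Return a timestamp-based base name (hypothetical naming scheme)."""
    now = now or datetime.now()
    return now.strftime("%Y%m%d-%H%M%S")


def build_arecord_command(card, seconds, wav_path, rate=44100):
    """Build the arecord call shown above as an argument list."""
    return [
        "arecord",
        "-D", f"plughw:{card},0",
        f"--duration={seconds}",
        "--rate", str(rate),
        wav_path,
    ]


def record(card, seconds, folder="pwdata"):
    """Record from the USB mic and return the path of the wav file."""
    wav_path = f"{folder}/{unique_name()}.wav"
    # check=True raises CalledProcessError if arecord exits with an error
    subprocess.run(build_arecord_command(card, seconds, wav_path), check=True)
    return wav_path
```

With the settings used in this project, the call would be record(1, 30). Names with one-second resolution stay unique as long as recordings start at least a second apart.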
At this point the text has been extracted from the recording and is ready for Dall-E2, but sending every word of a poem as the instruction for an AI image does not give satisfactory results.
What could be done? The same OpenAI Python API can also use ChatGPT, and ChatGPT can write an excerpt of the poem that is much more suitable for the illustration.
Registration for the OpenAI API is free at https://platform.openai.com
After registration, the API key must be inserted in the Python API calls. Below is an example of the excerpt request.
import openai

# uses the legacy Completions interface of the openai package (versions before 1.0)
excerptRequestPrefix = "Write a 5 words excerpt of: "
poem = "I made myself a snowball As perfect as could be. I thought I'd keep it as a pet And let it sleep with me. I made it some pajamas And a pillow for its head. Then last night it ran away, But first it wet the bed."
completion = openai.Completion.create(
    engine=model_engine,  # defined in the system settings
    prompt=excerptRequestPrefix + poem,
    max_tokens=1024,
    n=1,
    stop=None,
    temperature=0.5,
)
# text of the first (and only) completion
excerpt = completion.choices[0].text
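The text of a completion may come back with leading line breaks or wrapped in quotes, and it may even be empty. A minimal sketch of how the prompt could be assembled and the answer tidied before it is used for the illustration; build_excerpt_prompt and clean_excerpt are hypothetical helpers, not part of the project or of the OpenAI library.

```python
def build_excerpt_prompt(poem, prefix="Write a 5 words excerpt of: "):
    """Join the request prefix and the poem, collapsing line breaks and extra spaces."""
    return prefix + " ".join(poem.split())


def clean_excerpt(raw_text, fallback=""):
    """Strip whitespace and surrounding quotes; use the fallback if nothing is left."""
    text = raw_text.strip().strip("\"'").strip()
    return text if text else fallback
```

Passing the recognized text as the fallback, as in clean_excerpt(completion.choices[0].text, fallback=result), would still give the image step something to draw when the excerpt comes back empty.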
Image generation through Dall-E2 API
Dall-E2 is an AI system that can create realistic images and art from a description in natural language. How complicated is it to create an image? Not at all:
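# img_gen() is defined elsewhere in the program; it returns the URL of the generated image
res = img_gen(excerpt)
# download the image bytes from that URL
img_data = requests.get(res).content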
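The img_gen helper itself is not shown here. A minimal sketch of what such a helper could look like, assuming the legacy openai package (versions before 1.0) and its openai.Image.create call; the size default and the create_image parameter are assumptions added so the function can be tried without network access.

```python
def img_gen(prompt, size="512x512", create_image=None):
    """Ask the image API for one picture and return its URL."""
    if create_image is None:
        # Assumption: legacy openai package (< 1.0) with openai.api_key already set
        import openai
        create_image = openai.Image.create
    response = create_image(prompt=prompt, n=1, size=size)
    # The response lists the generated images; each entry carries a URL
    return response["data"][0]["url"]


def fake_create_image(**kwargs):
    """Stand-in for the API call, useful for a dry run without spending credit."""
    return {"data": [{"url": "https://example.invalid/image.png"}]}


print(img_gen("a snowball in pajamas", create_image=fake_create_image))
```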
Regarding API costs: the OpenAI API is not free, but new accounts receive an $18 free trial grant. With $18 you can generate 1000 images of 512x512 px, more than enough for several poetry readings.
The same grant also pays for the ChatGPT excerpts, where the fee depends on how many tokens the request uses.
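The figures above imply a price per image, which makes it easy to estimate other budgets. A small sketch of that arithmetic; it uses only the two numbers quoted in the text, and images_for is a hypothetical helper. Since the excerpt requests draw on the same grant, the result is an upper bound.

```python
from decimal import Decimal

grant = Decimal("18")   # free trial grant in USD, from the text
images = 1000           # 512x512 images the text says the grant covers

price_per_image = grant / images
print(price_per_image)  # 0.018


def images_for(budget, price=price_per_image):
    """Number of whole images that a budget in USD pays for."""
    return int(Decimal(str(budget)) / price)


print(images_for(5))    # 277
```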
What else is coded?
The main Python program is in charge of several things: generating unique file names, recording wav files, applying speech recognition, creating the excerpt, reading the encoder and sending the excerpt to the Dall-E2 API along with the painting style (Abstract, Pop Art, Ralph Steadman, Quinquela Martin, etc.).
The API does not return the image itself but a URL that is valid for one hour, so for logging purposes the image is downloaded, converted and stored. It is then projected using the OpenCV library.
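Storing and projecting the image could look roughly like this. It is a sketch under assumptions: the downloaded bytes are written as they arrive (the conversion step is skipped and a .png extension is assumed), and the function names and the window name are hypothetical. The OpenCV calls used are imread, namedWindow, setWindowProperty, imshow, waitKey and destroyWindow.

```python
from pathlib import Path


def store_image(img_data, name, folder="pwdata"):
    """Write the downloaded bytes to disk and return the file path."""
    path = Path(folder) / f"{name}.png"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(img_data)
    return path


def project_image(path, seconds):
    """Show the stored image full screen for the given number of seconds."""
    import cv2  # imported here so store_image also works without OpenCV installed

    image = cv2.imread(str(path))
    if image is None:
        raise ValueError(f"could not read image: {path}")
    window = "poetry-wall"  # hypothetical window name
    cv2.namedWindow(window, cv2.WINDOW_NORMAL)
    cv2.setWindowProperty(window, cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
    cv2.imshow(window, image)
    # waitKey takes milliseconds and returns early if a key is pressed
    cv2.waitKey(seconds * 1000)
    cv2.destroyWindow(window)
```

In the main loop this would be called as project_image(store_image(img_data, filename), imageSeconds).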
Potential errors like unrecognized text in the recording or Dall-E2 API not returning images due to policy violations are also handled.
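One way to handle both failure cases is to treat them the same: skip the round and keep the loop running. A minimal sketch, assuming the image helper is passed in as a callable; illustrate is a hypothetical name, and the broad except is only for illustration, since a real program would name the specific exception raised by the library.

```python
def illustrate(prompt, generate):
    """Return an image URL, or None when there is nothing to draw or the API refuses."""
    if not prompt.strip():
        # Speech recognition produced no text: skip this round
        return None
    try:
        return generate(prompt)
    except Exception as error:  # e.g. a request rejected by the content policy
        print(f"Image generation failed: {error}")
        return None
```

The caller only has to check for None before downloading and projecting the picture.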
Depending on the encoder position, one of these pictorial styles is used.
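styleToReturn = "Dibujo en el estilo de M. C. Escher"  # Drawing in the style of M. C. Escher
styleToReturn = "Grabado de Katsushika Hokusai"  # Print by Katsushika Hokusai
styleToReturn = "Pintura de arte abstracto de Picasso"  # Abstract art painting by Picasso
styleToReturn = "Pintura en el estilo de Quinquela Martin"  # Painting in the style of Quinquela Martin
styleToReturn = "Pintura en el estilo de Antonio Berni"  # Painting in the style of Antonio Berni
styleToReturn = "pintura en el estilo de Salvador Dalí"  # Painting in the style of Salvador Dalí
styleToReturn = "Dibujo en el estilo de Ralph Steadman"  # Drawing in the style of Ralph Steadman
styleToReturn = "Dibujo en el estilo de Leonardo Da Vinci"  # Drawing in the style of Leonardo Da Vinci
styleToReturn = "Pintura al oleo"  # Oil painting
styleToReturn = "Pintura puntillista"  # Pointillist painting
styleToReturn = "Pintura realista"  # Realist painting
styleToReturn = "Pintura impresionista"  # Impressionist painting
styleToReturn = "Pintura abstracta"  # Abstract painting
styleToReturn = "Pintura surrealista"  # Surrealist painting
styleToReturn = "Arte pop"  # Pop art
styleToReturn = "Pintura digital cyberpunk"  # Cyberpunk digital painting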
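How the four encoder pins become one of sixteen styles is not shown, so the following is a sketch under stated assumptions: the encoder is assumed to output a plain 4-bit binary code with the first pin as the least significant bit, the pins are assumed to use BCM numbering, the style list is shortened and its order is a guess, and the way style and excerpt are joined into one prompt is also an assumption. All function names are hypothetical.

```python
# Shortened list; the real program has sixteen entries
STYLES = [
    "Dibujo en el estilo de M. C. Escher",
    "Grabado de Katsushika Hokusai",
    "Pintura de arte abstracto de Picasso",
    "Pintura en el estilo de Quinquela Martin",
]


def position_from_bits(bits):
    """Combine four 0/1 readings (least significant first) into a position 0..15."""
    position = 0
    for index, bit in enumerate(bits):
        position |= (1 if bit else 0) << index
    return position


def read_encoder(pins=(5, 6, 13, 19)):
    """Read the encoder pins on a Raspberry Pi (assumes BCM pin numbering)."""
    import RPi.GPIO as GPIO  # only available on the Pi, so imported lazily

    GPIO.setmode(GPIO.BCM)
    bits = []
    for pin in pins:
        GPIO.setup(pin, GPIO.IN)
        bits.append(GPIO.input(pin))
    return position_from_bits(bits)


def build_image_prompt(position, excerpt):
    """Prefix the excerpt with the style selected by the encoder."""
    style = STYLES[position % len(STYLES)]
    return f"{style}: {excerpt}"
```

For example, build_image_prompt(position_from_bits([1, 0, 0, 0]), "a snowball in pajamas") selects the second style in the list.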
System settings
There are several settings inside the Python code.
# OpenAI API key
openai.api_key=""
# the engine used for ChatGPT
model_engine = "text-davinci-003"
# language for Speech Recognition
audioLanguage="Spanish"
# recording seconds
recordingTime=30
# image display seconds
imageSeconds=15
# loop delay
loopSeconds=1
# verbose mode
debugMode=1
# id for the USB mic
usbMicCard=1
# excerpt mode
excerptEnabled=1
# GPIO pins for the encoder
code1Pin=5
code2Pin=6
code3Pin=13
code4Pin=19
# how creative the excerpt could be
chatGPTTemperature=0.5
# maximum number of tokens to spend on the answer
chatGPTMaxTokens=1024
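Two of these settings lend themselves to small helpers. The sketch below is an assumption about how they might be used, not the project's code: it reads the API key from an environment variable instead of the source file, and it maps a language name such as the audioLanguage value to the kind of language tag that recognize_google accepts in its language argument. The mapping, the variable name OPENAI_API_KEY and both function names are hypothetical.

```python
import os

# Hypothetical mapping from a language name to a speech recognition language tag
LANGUAGE_TAGS = {"English": "en-US", "Spanish": "es-ES"}


def language_tag(audio_language, default="en-US"):
    """Return the language tag for a language name, or the default if unknown."""
    return LANGUAGE_TAGS.get(audio_language, default)


def api_key_from_env(variable="OPENAI_API_KEY"):
    """Read the API key from the environment so it stays out of the source file."""
    key = os.environ.get(variable, "")
    if not key:
        raise RuntimeError(f"{variable} is not set")
    return key
```

The settings block could then use openai.api_key = api_key_from_env() and pass language_tag(audioLanguage) to the recognizer.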
Demo
Here are some demos with poetry and with prose. In fact, there is no limitation on the kind of text that can be supplied. You can turn on the Poetry Wall at your dinner table and illustrations will appear according to the conversation. You can place the Poetry Wall next to the TV during the breaking news and fun is guaranteed.
Demo reading poetry
Demo reading prose
Illustration examples
- The machine to think of Gladys (Mario Levrero)
- Nyctograph machine (Lewis Carroll)
- The Klausner Machine (Roald Dahl)
- Reading machine (Raymond Roussel)
- The Argentinian rain making machine
- Hunter S. Thompson ASCII art installation
- Haiku reader
- Literature machines