+++
title = "A Musician’s Journey"
description = "An experiment in generated music in the form of a sound collage"
date = 2023-05-27

[taxonomies]
categories = ["Electronic", "AI", "Sound Collage", "Work in Progress"]

[extra]
content_class = "musicians-journey"
+++

I wanted to try out [Bark](https://github.com/suno-ai/bark), a transformer-based text-to-audio model by Suno. I had some unused lyrics lying around, so I wrote a Python script that fed them, line by line, to Bark.

The script takes a [language](https://github.com/suno-ai/bark#supported-languages) and a [voice number](https://suno-ai.notion.site/8b8e8749ed514b0cbf3f699013548683?v=bc67cff786b04b50b3ceb756fd05f68c) as arguments, and an optional folder name. Each generated file is saved to a folder named for the verse and line (e.g. `verse1_line4`), and the filename indicates the language and voice used.

When generating the audio, I surrounded each line with `♪`, which has a special meaning to Bark: it tells the model to generate the prompt as music. This sometimes leads to the line being "sung", often poorly, and sometimes generates a full piece of realistic music. Sometimes it’s something in between.

After generating **a bunch** of files for English, Korean, Turkish, Spanish etc. (Bark usually generates the audio in broken English if the prompt is in English and the language is not), I went through all the folders, picked the "samples" I liked best, and imported them into Reaper. I set an arbitrary limit of five samples per line of text, and then I started layering.

Most samples are used basically as is (apart from volume, splits and start/end points), but a few were slightly time-stretched or pitch-shifted. I added a little reverb to each of the ten resulting tracks, picking presets pseudo-randomly. The panning for each track is also pseudo-random (a little to the left, a little to the right, a little to the left etc.).

## The script

Below is the Python script I used to generate the files.
It’s pretty specific to this task, but it would be trivial to make it more general (by accepting a text file as an argument, for example).

```python
import os
import random
import sys

# Opt in to Apple's MPS backend; must be set before bark is imported
os.environ["SUNO_ENABLE_MPS"] = "True"

from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

# Defaults
lang = "en"
voice = "6"
base_folder = "blacksheep"

# Each verse is a string in this list. Each line is separated with a newline character.
text_prompts = [
    "Baa, baa black sheep \nHave you any wool \nYes sir, yes sir \nThree bags full",
    "One for my master \nAnd one for my dame \nAnd one for the little boy \nWho lives down the lane",
]

expressions = [
    "[laughter]",
    "[laughs]",
    "[sighs]",
    "[music]",
    "[gasps]",
    "[clears throat]",
]

# Perhaps this should be optional?
if len(sys.argv) < 3:
    print("Please supply language and voice arguments. Folder name is optional.")
    sys.exit(1)

lang = sys.argv[1]
voice = sys.argv[2]

# Base folder is optional
if len(sys.argv) >= 4:
    base_folder = sys.argv[3]

# Build the voice preset after parsing the arguments so they take effect
history_prompt = "v2/{}_speaker_{}".format(lang, voice)

os.makedirs(base_folder, exist_ok=True)

print("Hang on, preloading models...")
preload_models()

for verse_number, verse in enumerate(text_prompts, start=1):
    for line_number, line in enumerate(verse.splitlines(), start=1):
        # Add a little extra expression sometimes
        expression = ""
        if random.random() > 0.8:
            expression = random.choice(expressions)

        # Generate the audio, wrapping the line in ♪ so Bark treats it as music
        audio_array = generate_audio(
            "{} ♪ {} ♪".format(expression, line.strip()),
            history_prompt=history_prompt,
        )

        # Set folder name and create it if it doesn’t exist
        folder_name = "verse{}_line{}".format(verse_number, line_number)
        full_path = "{}/{}".format(base_folder, folder_name)
        os.makedirs(full_path, exist_ok=True)

        # Save to disk
        filename = "{}/voice-{}{}_verse-{}_line-{}.wav".format(
            full_path, lang, voice, verse_number, line_number
        )
        write_wav(filename, SAMPLE_RATE, audio_array)
```

## The lyrics

These are the stupid lyrics:

```txt
He started out playing bass for The Foregone Conclusions
But he left in a week, that was a foregone conclusion
And he switched to playing piano with The Four Bar Blueses,
Stuck behind a piano playing four bar blueses
So he learned the guitar and he joined The Riffs
Happy getting solos but annoyed with the riffs
Played percussion over summer for The Vibraslaps
But his hands got kinda calloused from those vibra slaps
Back to the bass in a band called The Bluff
Swore he could slap but the band called the bluff
Was a backup singer for The Break-ups
But he pulled up his stakes before the inevitable break-up
Was a manager a while for The Snowflakes
But he rage quit – Saying “Y’all a bunch of snowflakes”
Then he toured for a while with The Elderly Statesmen
But they snored when they slept like some elderly statesmen
Played the sax for a while in The Phenomenal Mess
But he quit because their style was a phenomenal mess
Played a little tambo for The Very Legits
But the pay wasn’t shit it wasn’t very legit
So he played the ocarina with Sha-ronne and the PJs
Got fired when he eyeballed Sha-ronne in her PJs
Finally he started a small woodwind combo
Just him and his sax was a would-win combo
```

## The result

The resulting audio can be downloaded from the link below, or you can use the `:play` command to play it directly in your browser.

[A Musician’s Journey 1.1](https://files.mefirst.se/mp3/spitlo_-_a-musicians-journey-1.1.mp3) (05:19, 10.2 MB)

## Changelog

- v1.1 Decrease volume on intro
- v1.0 Initial release