Routing text-to-speech on the mac

There are any number of ways of working with the built-in text-to-speech synthesis capabilities on the mac. All of the music programming languages I use - Max, Pd and SuperCollider - offer ways of doing this, and I've also had great success with controlling the output using AppleScript. The problem is that in every case the audio is actually synthesised by the mac os, which means it is not accessible within an audio environment for further processing.
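(Incidentally, the same built-in synthesis is exposed on the command line as `say`, which can render speech to an audio file with its `-o` flag instead of playing it through the speakers. Here's a minimal Python sketch which just builds the command line - the voice name and output path are only illustrative, and actually running it of course requires a mac:)

```python
import subprocess

def say_command(text, voice="Victoria", out_path=None):
    """Build an argument list for the mac's built-in `say` command.
    With -o, `say` renders the speech to an audio file instead of
    playing it through the system output."""
    cmd = ["say", "-v", voice]
    if out_path is not None:
        cmd += ["-o", out_path]
    cmd.append(text)
    return cmd

# On a mac you would then run it with, e.g.:
#   subprocess.run(say_command("text to screech", out_path="screech.aiff"))
print(say_command("text to screech", out_path="screech.aiff"))
```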

I was inspired to have another think about this recently by a thread on the SuperCollider list where somebody was trying to do exactly this, by using Jack to route the sound from the mac back into the application for further processing. What I've started to experiment with is routing the audio into a different application: in the example above, controlling the speech synthesis in Max and passing the audio into Pd. Combined with the facility to pass midi from Max to Pd (easy), I think I can see how I can make a workable and potentially interesting system. But, for now, just proving to myself that it can be done :)

Yet more text-to-screech

There's quite a history of musicians and sound artists doing creative things with speech synthesis. One of the best-known examples is the Radiohead track Fitter Happier from the album OK Computer, and it's not hard to find other cases of commercial artists incorporating this kind of material in tracks.

Very often this has been done on the mac, which has always had speech synthesis built in. (There's a very interesting anecdote about how speech synthesis came to be included on the very first macs at the personal insistence of Steve Jobs.) A number of years ago - I can't find the links now - there was a small community of composers who were authoring and releasing 'tracks' which consisted of nothing but SimpleText files, which were to be 'played back' using the speech synthesis facility. This kind of thing was more effective back then: the earlier versions of the mac speech system responded in interesting and unpredictable ways to aberrant texts.

I've often used this kind of thing in my own work, and I've coined my own term for it: 'text-to-screech'. Here's an example, this is a track called 'vifyavif wif yavif-oo', which also forms part of the instrumental piece donkerstraat:

I've now started work on another such project. This will be a performance piece, where I will be typing text live: I've done work along these lines before, but the new twist will be to try to find a way to add extra processing to the speech synthesis live, including perhaps sampling and looping. There are some technical problems with doing this on the mac, however… which I'll make the subject of another post.

The Sloans Project

I saw a great new opera recently, The Sloans Project. Composed by Gareth Williams with a libretto by David Brook, it was set and performed in the historic Sloans Bar and Restaurant. Yes, that'll be opera performed in a pub! The opening scene was a coup de théâtre. As the audience milled about in the bar downstairs, the show just started right there, with a couple at the bar bursting into song, soon to be answered by another drunken-looking guy at the bar. After that the audience were invited to process to some of the upstairs rooms, where there was a series of three vignettes, followed by a culminating scene in the ballroom.

Gareth of course is a friend and colleague of mine of old, with his PhD at the RSAMD – sorry, the Royal Conservatoire of Scotland – running more or less in parallel to mine. Recently he's been ploughing the operatic furrow consistently and with great success. His musical language is very spare and secure, with a great command of vocal writing. In this piece I was drawn in by the unique staging as much as anything else, but I seemed to detect some new thinking in his approach, particularly in scene two, Chopin's Ghosts, which collided separate and uncoordinated music in different keys on the harp and piano in a very creative and effective way.

parkbenchpound

Mags and I were enjoying a pleasant evening walk in Maxwell Park, when she had a lucky find: a pound coin sitting on the grass beside a metal park bench. We had some fun sitting in the sun and improvising with the sound of the bench and the coin, which I recorded on my new HTC Wildfire S Android phone. I made the track in Logic using nothing but those sounds: fairly minimal effects, just some pitch shifting and fake stereo. Sort of an urban gamelan thing…

Music and intellectual play

Next year at the Royal Conservatoire of Scotland I'm going to be sharing the teaching of the Teaching Musics of the World module with my colleague Barnaby Brown. I was doing some background reading, having a look at Bonnie C. Wade's 'Thinking Musically', which is one of the overview volumes in OUP's 'Global Music Series'. This is just an undergrad textbook, but I did stumble upon a notion which set me thinking, where she discusses 'intellectual play':

Intellectual play as well as aesthetic choice are the major factors in an improvisatory South Indian performance practice called rāgamālikā ("garland of ragas"). Rāgamālikā comprises a progression from one rāga to another, each sufficiently similar that one must listen closely to detect the shift, but sufficiently different that contrast has been achieved. The intellectual play is enhanced when, in vocal rāgamālikā, the names of the new modes are introduced into the text, embedded in clever ways. (p126)

As ever, I find comparisons with the practices of other musical cultures illuminating with regard to my own work. There are so many ideas in my pieces: things which are there primarily for intellectual reasons. Which sounds bad and wrong, doesn't it! Describing music as intellectual is a criticism, isn't it? Especially if I were to self-describe my music as intellectual?

Liebesglück hat tausend Zungen

This Tuesday 3rd May at 1600 sees the performance of the only piece I have in this year's Plug festival at the RSAMD in Glasgow, Liebesglück hat tausend Zungen – a lied, for soprano and piano. Now, why on earth, you say, would anyone in this day and age want to write a lied of all things?! Good question: I'm not sure I know the answer. However, the fact is that the unifying theme of this year's festival is deemed to be something called the Glasgow Liederbuch, to which all the composers have been invited to contribute. Which means two-and-a-bit concerts devoted to new lieder. Written within rather strict guidelines, I have to say: voice and piano only, no electronics, German poetic text from the era… basically, we're not allowed to do anything which Schubert didn't do. (So, e.g., dying of syphilis and not finishing symphonies is ok; playing inside the piano is not.)

There's a .pdf file of the score if anyone is particularly curious to see in what way I've tackled this rather odd commission. Have I done anything Schubert wouldn't have done? I think so, I think so :)

Max 5 and SuperCollider using sc3~

Another way of doing it, using the sc3~ object. I can't quite make up my mind at the moment which is the way forward. This way, it looks like you'd have to develop your code in SuperCollider, then paste it into Max and hope it still works… I have a feeling the OSC bridging method is more generally useful: for instance, you could use it with Pd as well. On the other hand, this way it's all together in one patch in a single program; to run it you don't need SC installed, and it'll probably be a lot easier to deal with when you come back to it in five years' time…
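(For what it's worth, the OSC bridging route needs nothing exotic under the hood: an OSC message is just a UDP datagram containing a null-padded address, a type-tag string, and big-endian arguments. Here's a rough Python sketch of that encoding - port 57120 is sclang's usual port, but the /freq address is made up for the example, and in practice you'd just use [udpsend] in Max or Pd, or NetAddr in SuperCollider:)

```python
import socket
import struct

def osc_message(address, *floats):
    """Encode a minimal OSC message: null-padded address string,
    type-tag string (one 'f' per argument), then big-endian floats."""
    def pad(b):
        # OSC strings are null-terminated and padded to a 4-byte boundary
        return b + b"\x00" * (4 - len(b) % 4)
    msg = pad(address.encode())
    msg += pad(b"," + b"f" * len(floats))
    for f in floats:
        msg += struct.pack(">f", f)
    return msg

def send_osc(host, port, address, *floats):
    # Fire the message off as a single UDP datagram
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(osc_message(address, *floats), (host, port))

# e.g. send_osc("127.0.0.1", 57120, "/freq", 440.0)
```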