
Routing text-to-speech on the mac

There are any number of ways of working with the built-in text-to-speech synthesis capabilities on the mac. All of the music programming languages I use - Max, Pd and SuperCollider - offer ways of doing this, and I've also had great success with controlling the output using AppleScript. The problem is that in every case the audio is actually synthesised by the mac os itself, which means it is not accessible within an audio environment for further processing.

I was inspired to have another think about this recently by a thread on the SuperCollider list where somebody was trying to do exactly this, by using Jack to route the sound from the mac back into the application for further processing. What I've started to experiment with is routing the audio into a different application: in the example above, controlling the speech synthesis in Max and passing the audio into Pd. Combined with the facility to pass midi from Max to Pd (easy), I think I can see how I can make a workable and potentially interesting system. But, for now, just proving to myself that it can be done :)
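There's also an offline workaround that sidesteps inter-application routing entirely: the mac's built-in `say` utility can render speech straight to an audio file, which Max, Pd or SuperCollider can then load as an ordinary soundfile. A minimal sketch in Python (the voice name and file path are just placeholders):

```python
import subprocess
import sys

def say_to_file(text, out_path, voice="Alex"):
    """Ask macOS's built-in `say` utility to render speech to a file
    (via its -o flag) instead of playing it through the speakers.
    Returns the command list so it can be inspected on other platforms."""
    cmd = ["say", "-v", voice, "-o", out_path, text]
    if sys.platform == "darwin":  # `say` only exists on the mac
        subprocess.run(cmd, check=True)
    return cmd

# e.g. say_to_file("fitter, happier, more productive", "speech.aiff")
```

You obviously lose the live element this way, but for sampling-and-looping textures a pre-rendered file may be all that's needed.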

Yet more text-to-screech

There's quite a history of musicians and sound artists doing creative things with speech synthesis. One of the best known examples is the Radiohead track Fitter Happier from the album OK Computer, and it's not hard to find other cases of commercial artists incorporating this kind of material in tracks.

Very often this has been done on the mac, which has always had speech synthesis built in. (There's a very interesting anecdote about how speech synthesis came to be included on the very first macs at the personal insistence of Steve Jobs.) A number of years ago - I can't find the links now - there was a small community of composers who were authoring and releasing 'tracks' which consisted of nothing but SimpleText files, which were to be 'played back' using the speech synthesis facility. This kind of thing was more effective back then: the earlier versions of the mac speech system responded in interesting and unpredictable ways to aberrant texts.

I've often used this kind of thing in my own work, and I've coined my own term for it: 'text-to-screech'. Here's an example: a track called 'vifyavif wif yavif-oo', which also forms part of the instrumental piece donkerstraat:

I've now started work on another such project. This will be a performance piece, where I will be typing text live: I've done work along these lines before, but the new twist will be to try to find a way to add extra processing to the speech synthesis live, including perhaps sampling and looping. There are some technical problems with doing this on the mac, however… which I'll make the subject of another post.

The Sloans Project

I saw a great new opera recently, The Sloans Project. Composed by Gareth Williams with a libretto by David Brook, it was set and performed in the historic Sloans Bar and Restaurant. Yes, that'll be opera performed in a pub! The opening scene was a coup de théâtre. As the audience milled about in the bar downstairs, the show just started right there, with a couple at the bar bursting into song, soon to be answered by another drunken-looking guy at the bar. After that the audience were invited to process to some of the upstairs rooms, where there was a series of three vignettes, followed by a culminating scene in the ballroom.

Gareth of course is a friend and colleague of mine of old, with his PhD at the RSAMD – sorry, the Royal Conservatoire of Scotland – running more or less in parallel to mine. Recently he's been ploughing the operatic furrow consistently and with great success. His musical language is very spare and secure, with a great command of vocal writing. In this piece I was drawn in by the unique staging as much as anything else, but I seemed to detect some new thinking in his approach, particularly in scene two, Chopin's Ghosts, which collided separate and uncoordinated music in different keys on the harp and piano in a very creative and effective way.

parkbenchpound

Mags and I were enjoying a pleasant evening walk in Maxwell Park, when she had a lucky find: a pound coin sitting on the grass beside a metal park bench. We had some fun sitting in the sun and improvising with the sound of the bench and the coin, which I recorded on my new HTC Wildfire S Android phone. I made the track in Logic using nothing but those sounds: fairly minimal effects, just some pitch shifting and fake stereo. Sort of an urban gamelan thing…
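Logic's plug-ins are a black box, of course, but the 'fake stereo' trick itself is simple enough to sketch: duplicate the mono signal and delay one copy by a few milliseconds, and the ear hears width rather than an echo (the Haas effect). A rough illustration in Python, with an assumed 44.1 kHz sample rate:

```python
def fake_stereo(mono, delay_samples=441):
    """Turn a mono signal (a list of samples) into pseudo-stereo:
    left is the dry signal, right is the same signal delayed by
    delay_samples (441 samples is ~10 ms at 44.1 kHz).
    Returns a list of (left, right) sample pairs."""
    left = mono
    right = [0.0] * delay_samples + mono[: len(mono) - delay_samples]
    return list(zip(left, right))

# e.g. fake_stereo(coin_samples) -> [(l0, r0), (l1, r1), ...]
```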

Music and intellectual play

Next year at the Royal Conservatoire of Scotland I'm going to be sharing the teaching of the Teaching Musics of the World module with my colleague Barnaby Brown. I was doing some background reading, having a look at Bonnie C. Wade's 'Thinking Musically', which is one of the overview volumes in OUP's 'Global Music Series'. This is just an undergrad textbook, but I did stumble upon a notion which set me thinking, where she discusses 'intellectual play':

Intellectual play as well as aesthetic choice are the major factors in an improvisatory South Indian performance practice called rāgamālikā ("garland of ragas"). Rāgamālikā comprises a progression from one rāga to another, each sufficiently similar that one must listen closely to detect the shift, but sufficiently different that contrast has been achieved. The intellectual play is enhanced when, in vocal rāgamālikā, the names of the new modes are introduced into the text, embedded in clever ways. (p126)

As ever, I find comparisons with the practices of other musical cultures illuminating with regard to my own work. There are so many ideas in my pieces: things which are there primarily for intellectual reasons. Which sounds bad and wrong, doesn't it! Describing music as intellectual is a criticism, isn't it? Especially if I were to self-describe my music as intellectual?

Liebesglück hat tausend Zungen

This Tuesday 3rd May at 1600 sees the performance of the only piece I have in this year's Plug festival at the RSAMD in Glasgow, Liebesglück hat tausend Zungen – a lied, for soprano and piano. Now, why on earth, you say, would anyone in this day and age want to write a lied of all things?! Good question: I'm not sure I know the answer. However, the fact is that the unifying theme of this year's festival is deemed to be something called the Glasgow Liederbuch, to which all the composers have been invited to contribute. Which means two-and-a-bit concerts devoted to new lieder. They've been written within rather strict guidelines, I have to say: voice and piano only, no electronics, German poetic text from the era… basically, we're not allowed to do anything which Schubert didn't do. (So, eg dying of syphilis and not finishing symphonies is ok, playing inside the piano is not.)

There's a .pdf file of the score if anyone is particularly curious to see in what way I've tackled this rather odd commission. Have I done anything Schubert wouldn't have done? I think so, I think so :)

Max 5 and SuperCollider using sc3~

Another way of doing it, using the sc3~ object. I can't quite make up my mind at the moment which is the way forward; this way it looks like you'd have to develop your code in SuperCollider, then paste it into Max and hope it still works… I have a feeling the OSC bridging method is more generally useful: for instance, you could use it with Pd as well. On the other hand, this way it's all together in one patch in a single program; to run it you don't need SC installed, and it's probably a lot easier to deal with when you come back to it in five years' time…

Max 5 to SuperCollider using OSC

Updated

I guess it's just possible some people might not get what's going on here :) SuperCollider is a very powerful text-based programming language for sound. If you know what you're doing, then with just a couple of lines of code you can create really fascinating sounds and textures, even entire compositions; see for instance sc140, an album where each piece is created using just a twitter-long 140 characters of code.

Unfortunately, I don't know what I'm doing; my mind doesn't seem to work logically enough to really do computer programming properly! Enter the other half of the equation, Max 5 (or Max/MSP as it used to be known). This is a graphical programming language, which allows you to make stuff happen just by plugging things together on the screen.

I've got quite good at Max, and have sometimes managed to make quite interesting sounds in SuperCollider. I found the missing link between these two in a great blog post by Fredrik Olofsson. It's a way of using a so-called 'quark', a type of extension to SuperCollider, to send OSC (Open Sound Control) messages from Max to SC. So, what I'm happy to have found here is an easy way to attach knobs to these hard-to-get-at text-based sounds. The next step from here will be controlling SC using external midi/bluetooth/whatever hardware, again via Max.
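Under the hood an OSC message is just a small binary packet: a null-padded address string, a type-tag string, then the arguments. A stdlib-only Python sketch of the encoding (the address /freq is just illustrative; 57120 is sclang's default listening port):

```python
import struct

def osc_pad(data):
    """OSC strings are null-terminated, then padded to a 4-byte boundary."""
    return data + b"\x00" * (4 - len(data) % 4)

def osc_message(address, *args):
    """Encode an OSC message with float32 arguments: the sort of
    packet Max's [udpsend] object puts on the wire for SC to receive."""
    packet = osc_pad(address.encode("ascii"))
    packet += osc_pad(("," + "f" * len(args)).encode("ascii"))
    for value in args:
        packet += struct.pack(">f", value)  # big-endian float32
    return packet

packet = osc_message("/freq", 440.0)
# send it with socket.sendto(packet, ("127.0.0.1", 57120))
```

In practice Max and the quark handle all of this for you; the point is only that the 'bridge' is nothing more exotic than small UDP packets in this format.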

FIMPaC day 1

So here I am at the Forum for Innovation in Music Production and Composition at Leeds College of Music. It's been a while since I've attended a conference, but I'm getting back into the swing of it. It's hardly coalmining; nevertheless, it's quite tiring to sit still all day listening to a long series of what can be quite dense and complicated presentations.

The subject matter for this conference at least is consistently up my street. Here's a quick outline of the things people have been talking about; these are not the paper titles, but rather my quick summaries:

  • Julian Brook on the role and status of the person operating the mixing desk in ea music
  • Martin Blain on the MMULe laptop ensemble in Manchester (must find out about an article by Michael Kirkby which he referred to, on acting and non-acting?)
  • Adam Stansbie building up and then comprehensively knocking down some dubious philosophical ideas (Godlovitch) which have been proposed around performance in ea music
  • some, um, perhaps not entirely convincing comparisons between so-called 'avant-rock' and 'experimental' musics from Chris Ruffoni
  • an entertaining and difficult paper from Robert Wilsmore on music sampling, full of clever postmodern confusions, most notably his so-called 'Wilsmore Symphony No 2', where he proposed the thought experiment of taking Beethoven's Symphony No 2, scoring out Beethoven's name and putting his own on it instead
  • Jon Aveyard comparing practices in binaural audio to the cinematic notion of the 'point of view' (interesting, ideas for a piece in there; also some nice demos of different kinds of POV shot from Goodfellas and The Lady in the Lake (1947))
  • Robert Ratcliffe showing some of his complex and brilliant mashups of, like, Aphex Twin and Berio?!? I now remember meeting Robert at a previous concert, his work is really, really great, clear, entertaining, naughty
  • Mark Marrington giving a thoughtful survey of the state of the modern digital audio workstation, and how it informs the work of his composing students
  • Rob Godman talking about live-ness and stage presence, with reference to his piece Duel for piano and sound projection

Oh, yes and we had Jazzie B this morning, for, well, a keynote speech, but really mostly a question and answer session about his wide range of experiences as a music producer. Also met Frank Millward, who it turns out is doing a project in Glasgow at the moment which sounds right up my street; looking forward to hooking up with him again. Also caught up with Jane Anthony; I did a piece a few years ago for her Leeds Lieder+ festival, and we talked about me coming down for a talk, maybe even doing the whole song cycle down there. Also met a couple of my ex-students. Also, just had a great bowl of satay noodles. Enough for one day, I think.