"Sometimes, an actor, a music, and an international," writes a computer program, trying to describe a photo of me. "On the other hand, a fashion and a handsome: the fashion is known to some as anfertigen, and the handsome is also known as a pretty. Of course, the handsome evokes impressive." Aw, shucks, you flirty bot! You describe all the selfies as handsome, but I can still be flattered by your near-random word generation.
Submit your own photo – of anything you like, not necessarily of your handsome known as a pretty – to this clever coalition of complex computer programs and you will receive your own "lexograph," an artificial-intelligence-generated poem based on what is recognized in the picture. The project, word.camera, is the creation of a New York University graduate student named Ross Goodwin. Fans of computer programming and conceptual poetry have been allied on several important projects in the past; this new exercise gives them new reason to love each other.
This is not Goodwin's first arch art project: he has also done such things as making a "clock" out of all the texts on Project Gutenberg (his bot finds and highlights words that make a sentence that tells you what time it is). Such technological games are often clever, amusing and ephemeral – they are artistic one-liners that you see once, appreciate and move on from. They are one level of artistic sophistication above a joke.
But this particular writing machine is having a hit moment because it is creating unusually troubling and evocative texts. Thousands of people are submitting photos – mostly of themselves – to see what the machine will say about them.
Technically, it works like this: The image is first fed to image-recognition software called Clarifai that tries to identify objects in the photo and then writes a bunch of tags, mostly nouns. It is impressively accurate: It recognizes buildings, landscapes, and even small objects such as cameras. The list of things it comes up with is then fed into a language repository called ConceptNet. This is open-source software still in development; its creators call it "a semantic network containing lots of things computers should know about the world, especially when understanding text written by people."
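For the technically curious, the two-stage flow described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `tag_image` is a hypothetical stand-in that returns canned tags rather than calling any real image-recognition service, and `EDGES` is a tiny hand-made list of (start, relation, end) triples standing in for a semantic network. It is not Goodwin's code and uses neither Clarifai nor ConceptNet.

```python
from collections import defaultdict

# Hypothetical stand-in for the image tagger: a real service would analyze
# the uploaded photo. Here we simply return canned tags for two file names.
def tag_image(photo_name):
    canned_tags = {
        "selfie.jpg": ["portrait", "studio", "room"],
        "street.jpg": ["building", "street", "sun"],
    }
    return canned_tags.get(photo_name, [])

# Toy semantic network, invented for illustration: each edge links a start
# concept to an end concept through a named relation.
EDGES = [
    ("portrait", "IsA", "picture"),
    ("studio", "IsA", "workplace"),
    ("room", "PartOf", "building"),
    ("farm", "RelatedTo", "animals"),
]

def build_index(edges):
    """Group edges by their start concept for quick lookup."""
    index = defaultdict(list)
    for start, relation, end in edges:
        index[start].append((relation, end))
    return index

def related_concepts(tags, index):
    """Map each tag to the (relation, concept) pairs known for it."""
    return {tag: index.get(tag, []) for tag in tags}

index = build_index(EDGES)
tags = tag_image("selfie.jpg")
concepts = related_concepts(tags, index)
for tag, links in concepts.items():
    print(tag, links)
```

A tag the toy network has never heard of simply comes back with an empty list, which is roughly where "the north america remains unknown" comes from.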
This is where the weird stuff happens. It's funny how it's much easier for a computer to approximate vision than to approximate language. Language is truly hard to roboticize.
ConceptNet is not to be confused with the "convolutional neural network" – surely itself a randomly generated phrase – that powers the image recognition. The language side apparently takes the base words and then tries to improvise on them, linking them with associated concepts and stringing them together into sentences that are not only grammatically coherent but are also supposed to follow some kind of logical flow. The sentences don't succeed, of course – but that's what makes them so poetic.
It creates sentences such as: "Of course, a portrait and a north america: the portrait remains a picture of individual, and the north america remains unknown." Or: "Beyond, a mist and a travel." Or: "Granted, a building, a street, and a sun." Or: "On the other hand, a studio and a room: the studio is not a workplace consisting of a room or building where movies or television shows or radio programs are produced and recorded, and the room is made from an area within a building enclosing by walls and floor and ceiling." That last reads like Robbe-Grillet.
The program's tactics become obvious after a while: it introduces sentences with linking phrases such as "of course," "granted," "undoubtedly," "to this end," "to sum up." Those are pretty much random, but add an air of pomposity. Then it takes nouns and breaks them down into their component or related parts: farm evokes animals, it might say, and animals evokes chicken. It throws in a lot of foreign words and even words in different alphabets. Sometimes it just seems like a game of word association. Still, it's slightly more independent than so many random-Web-stripping poetry generators.
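Those tactics are simple enough to imitate. The sketch below is a guess at the general recipe, not the program's actual method: the `LINKERS` list reuses the phrases quoted above, while `ASSOCIATIONS`, `join_nouns` and `lexograph_sentence` are hypothetical names invented for this illustration, and the word associations are hard-coded rather than drawn from any real language repository.

```python
import random

# Linking phrases quoted in the column; picked at random for pomposity.
LINKERS = ["Of course", "Granted", "Undoubtedly", "To this end", "To sum up"]

# Hand-made associations, assumed for illustration only.
ASSOCIATIONS = {"farm": "animals", "animals": "chicken", "portrait": "picture"}

def join_nouns(nouns):
    """Render nouns as 'a x', 'a x and a y', or 'a x, a y, and a z'."""
    phrases = ["a " + noun for noun in nouns]
    if len(phrases) == 1:
        return phrases[0]
    if len(phrases) == 2:
        return phrases[0] + " and " + phrases[1]
    return ", ".join(phrases[:-1]) + ", and " + phrases[-1]

def lexograph_sentence(nouns, rng):
    """Build one sentence: random linker, noun list, optional 'evokes' clauses."""
    sentence = rng.choice(LINKERS) + ", " + join_nouns(nouns)
    clauses = [
        "the " + noun + " evokes " + ASSOCIATIONS[noun]
        for noun in nouns
        if noun in ASSOCIATIONS
    ]
    if clauses:
        sentence += ": " + ", and ".join(clauses)
    return sentence + "."

rng = random.Random(0)  # seeded so repeated runs give the same sentences
print(lexograph_sentence(["building", "street", "sun"], rng))
print(lexograph_sentence(["farm", "portrait"], rng))
```

Nouns with no known association just get listed, which reproduces the shape of "Granted, a building, a street, and a sun."; nouns with one get the little explanatory clause after the colon.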
One funny Web commenter noted that the stilted style ends up sounding like the slightly odd English of philosopher Slavoj Zizek. Once you have heard that, you cannot stop yourself from reading all of these texts – indeed, hearing them – in his distinctive strong Slovenian accent. It could be billed as a Zizek declamation generator.
Goodwin wrote on his blog, "I hope lexography eventually becomes accepted as a new form of photography. As a writer and a photographer, I love the idea that I could look at a scene and photograph it because it might generate an interesting poem or short story, rather than just an interesting image." There is nothing new about describing images in words, of course, but having a machine do it automatically takes these descriptions in unforeseen directions, and the unpredictable is always a useful element in art.
But there's something else that has made this little website so very popular: it's yet another form of self-portraiture. The first picture it encourages you to take is of yourself: If you look it up on your computer, it will ask you to enable your Web camera, which is of course pointed at your face. We already love programs that alter our selfies to make cool avatars for ourselves; there are dozens of them, but they are all visual. This is the linguistic avatar-maker. It gives us another stylized self-representation.
And it feeds our fascination with artificial intelligence (currently reflected in hit movies such as Her and Ex Machina). The computer's description of my face gives me the uncanny feeling that it can actually see me. It's like looking out from inside the machine.