Discovering Books: Booklamp vs. GoodReads
Here’s a brief description:
“Our program breaks a book up into 100 scenes and measures the ‘DNA’ of each scene, looking for 132 different thematic ingredients, and another 2,000 variables.” A reader can go to the BookLamp site, which was launched in beta last week, and do a keyword search for titles that meet the criteria similar to a title they plug into the site. Pundits have dubbed it the “Pandora for Books,” though Stanton prefers the term “Book Genome Project.”
“Say you’re looking for a novel like The Da Vinci Code. We have found that it contains 18.6% Religion and Religious Institutions, 9.4% Police & Murder Investigation, 8.2% Art and Art Galleries, and 6.7% Secret Societies & Communities, and other elements — we’ll pull out a book with similar elements, provided it is in our database,” says Stanton. [. . .]
But can a computer really accurately assess the content of a book? Stanton thinks so. “Our original models are based on focus groups,” he says. “We would give them a highly dense scene and a low density scene, for example, and ask them to assess them, which gave us a basis for training the models. Then we looked at books that might exceed the models and tweaked the formulas. In this way, our algorithms are trained like a human being.”
BookLamp quantifies such elements as density, pacing, description, dialogue and motion, in addition to numerous nuanced micro-categories, such as “pistols/rifles/weapons” or “explicit depictions of intimacy” or “office environments.”
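For the curious, here is a rough sketch of how matching on "similar elements" might work. To be clear, this is my guess and not BookLamp's actual method: cosine similarity is a stand-in I picked, the `cosine_similarity` and `most_similar` helpers are made up for illustration, and the two catalog titles and their numbers are hypothetical. Only the Da Vinci Code percentages come from the quote above.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two {theme: percentage} profiles."""
    themes = set(a) | set(b)
    dot = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in themes)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Percentages quoted in the article for The Da Vinci Code.
da_vinci_code = {
    "Religion and Religious Institutions": 18.6,
    "Police & Murder Investigation": 9.4,
    "Art and Art Galleries": 8.2,
    "Secret Societies & Communities": 6.7,
}

# Hypothetical catalog entries, invented purely for illustration.
catalog = {
    "Hypothetical Thriller A": {
        "Religion and Religious Institutions": 15.0,
        "Police & Murder Investigation": 12.0,
        "Secret Societies & Communities": 5.0,
    },
    "Hypothetical Romance B": {
        "Domestic Environments": 20.0,
        "Family Connections": 15.0,
        "Art and Art Galleries": 1.0,
    },
}

def most_similar(query, catalog, top_n=1):
    """Rank catalog titles by theme-profile similarity to the query."""
    ranked = sorted(
        catalog,
        key=lambda title: cosine_similarity(query, catalog[title]),
        reverse=True,
    )
    return ranked[:top_n]

print(most_similar(da_vinci_code, catalog))
```

Note what a profile like this leaves out: nothing in it says anything about how the sentences are actually written.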
I’m totally a sucker for this sort of shit . . . I think it’s great that people are finally thinking about how readers find books; I think it’s maybe detrimental to only read books that fit your fiction prejudices. (Of course, what example does the founder of Booklamp use in the interview? The fucking Da Vinci Code. Dear god, please make it stop.)
Knowledgeable recommendations used to be the function of booksellers, but since we, as a culture, seem not to need them (or bookstores) at all anymore, there are a number of book sites popping up to fill this void.
I’m sure this is over-simplifying, but there seem to be two major approaches to automated recommendations: the “similar user” approach, which is what Last.fm and GoodReads use and is based on the idea that if you like A, B, & C, and a lot of people who also like A, B, & C, also like Q, then you’ll probably like Q as well; and the “similar component” approach, which is what’s in play with Booklamp and Pandora and uses top-down analysis to recommend books/music with components similar to books/music you like.
Personally, I prefer the first approach, and have never really gotten Pandora, nor do I see how my favorite books can be accurately quantified (The Sound and the Fury is 25% suicidal tendencies and 25% narrated by a mentally challenged character? Or The Crying of Lot 49 is 48% paranoia and 88% too cool for school?)
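The "similar user" idea is simple enough to sketch in a few lines. This is a toy version under my own assumptions, not how GoodReads or Last.fm actually do it: the `recommend` function, the overlap-count scoring, and the sample readers are all hypothetical.

```python
from collections import Counter

def recommend(my_likes, other_users, top_n=3):
    """Suggest books liked by readers whose tastes overlap with mine.

    Each candidate book is scored by how many of my likes its fans share,
    summed over every overlapping reader.
    """
    scores = Counter()
    for likes in other_users:
        overlap = len(my_likes & likes)
        if overlap == 0:
            continue  # this reader shares nothing with me, so ignore them
        for book in likes - my_likes:
            scores[book] += overlap
    return [book for book, _ in scores.most_common(top_n)]

# Made-up readers: two share all of A, B, C with me and also like Q.
my_likes = {"A", "B", "C"}
other_users = [
    {"A", "B", "C", "Q"},
    {"A", "B", "C", "Q"},
    {"A", "Z"},
    {"X", "Y"},
]

print(recommend(my_likes, other_users))
```

Here Q comes out on top because the readers who like it overlap with me the most, and nobody had to decide what any of these books is "made of."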
Now that two of the three recommendation sites are at least in their beta phases—Bookish is still in the works—it seems sort of worthwhile to check and see what these sites recommend . . . It’s one thing to talk about the theory and drool over hot catchphrases like “discovery” and “genome,” but another to find out that no matter what you put in, you’re told that you should read Twilight.
I’m not sure I can convince you of this, but I’m doing this test live . . . I haven’t looked up any of these books yet, so I have no idea what I’ll find. (It’s like live blogging! Which I believe is now called “tweeting.” Anyway.)
So, first up, the book I have tattooed on my arm—The Crying of Lot 49.
Booklamp: No results. Well that’s unfortunate . . . Skipped right over the paragraph in the Publishing Perspectives article about how they currently only have 20,000 Random House books in their database . . .
GoodReads: The Recognitions by William Gaddis, The Sot-Weed Factor by John Barth, Dog Soldiers by Robert Stone, Falconer by John Cheever, Call It Sleep by Henry Roth.
OK, not so sure about the last couple, but the GoodReads recommendations are fine. Gaddis rocks, but I love JR more than The Recognitions, and Sot-Weed isn’t even close to my favorite Barth book. Still, not bad. Predictable, but not bad.
Since Booklamp is so limited in scope, I’m using their “Author Browse” function to find a good example . . . And oh, look, they have a listing for The Sound and the Fury! My snotty prediction about what the makeup would be was pretty crap . . . Instead of confused narrators and philosophical issues about time, the five most prominent “StoryDNA” elements according to Booklamp are: Financial Matters, Family Connections, Domestic Environments, Automobiles & Vehicles, and Nature/Forests/Trees. The fuck?? Seriously? This does not bode well . . .
Booklamp: Sanctuary by William Faulkner, Moon Women by Pamela Duncan, Leo and the Lesser Lion by Sandra Forrester, Good-bye Marianne by Irene Watts, and Telling Lies to Alice by Laura Wilson
GoodReads: Appointment in Samarra by John O’Hara, Loving by Henry Green, The Death of the Heart by Elizabeth Bowen, Under the Net by Iris Murdoch, and An American Tragedy by Theodore Dreiser
Although they aren’t the first authors that come to mind when I think of William Faulkner (I was expecting Flannery O’Connor), I do love me some Henry Green and Elizabeth Bowen. And to be completely frank, I have no fucking idea what any of the Booklamp recommendations are. Irene Watts? Leo and the Lesser Lion?? Maybe I’m just ignorant and missing out on great fiction . . .
Leo and the Lesser Lion. Listed by the publisher as Juvenile Fiction: A heartwarming family story set during the Depression that reads like a classic. Everyone’s been down on their luck since the Depression hit. But as long as Mary Bayliss Pettigrew has her beloved older brother, Leo, to pull pranks with, even the hardest times can be fun. Then one day, there’s a terrible accident, and when Bayliss wakes up afterward, she must face the heartbreaking prospect of life without Leo. And that’s when her parents break the news: they’re going to be fostering two homeless little girls, and Bayliss can’t bear the thought of anyone taking Leo’s place. But opening her heart to these weary travelers might just be the key to rebuilding her grieving family.
Oh, my. Booklamp fail. “Juvenile Fiction”?? I’ll bet $1 million that there’s not a single stylistically interesting thing about this novel. And I’ll also bet that there’s no possible way I’d read this book and be like, “wow, I’ve never read a book more like The Sound and the Fury than Sandra Forrester’s little gem!” Fuck. And no.
So far Booklamp is coming in a distant second . . . And exposing all the flaws of this top-down recommendation approach. This system really doesn’t seem to account for writing style, which, in my opinion, is maybe the most important feature of any work of fiction. Sure, it’s got “automobiles” and I do like to read about people who move from one point to another, but if the writing is shitty, there’s no number of “automobile/vehicle” scenes that will save a novel. And how do you identify the “genome” for exciting writing? (Not to bang a dead drum, but this is why booksellers and librarians and actual readers are so goddamn important.)
Let’s try one more, this time with feeling: Independent People by Halldor Laxness.
Booklamp: the first four are all Laxness books, which seems a bit of a cheat, so I’ll skip those . . . The Writer and the World by V.S. Naipaul, Dead Souls by Nikolai Gogol, Agnes Grey by Anne Bronte, A Country Doctor by Sarah Orne Jewett, and Walt Whitman’s Secret by George Fetherling
GoodReads: Njal’s Saga by Anonymous, Angels of the Universe by Einar Mar Gudmundsson, The Pets by Bragi Olafsson, The Blue Fox by Sjon, and Growth of the Soil by Knut Hamsun.
Interesting to end with this book . . . Booklamp’s recommendations are a lot less country-specific compared to GoodReads. But on the whole, I think I like the GoodReads recommendations better, especially since they include one of our books.
There’s no conclusion to this post except that I find this all very interesting in theory, and flawed in execution. I’m sure both of these sites (and Bookish) will get better and better as time goes on and data accumulates, and will play a larger role in how books find an audience as booksellers continue to decrease in number . . .