Booklamp's Algorithms Pick Reads For You
MIKE PESCA, host:
Perhaps you are familiar with the music site pandora.com. You type in a band or a song that you like and it analyzes the melody, harmony, rhythm, instruments, orchestration, arrangements, lyrics, kind of everything that goes into that song, and then it spits you out other songs that you might like. The website booklamp.org is like Pandora for books, although they say that they started working in 2003, before Pandora.
You type in your favorite book and the site's algorithm scans for books with a similar level of action and amount of description, dialogue, tense and perspective, similar books that you might like based on the one you typed in. The prototype looked at more than 200 books. It plotted 729,000 data points across 30,293 scenes, but the creator is hinting at a major contract that will push his idea into the big time. Aaron Stanton is the 26-year-old creator of booklamp.org. He joins us from Boise, Idaho. Hey, Aaron.
Mr. AARON STANTON (Creator, Booklamp.org): Hey, Mike, how are you?
PESCA: Good. So, you started in 2003. You were 21 at the time?
Mr. STANTON: Well, I would have to do the numbers, but yeah, somewhere in the area. I was a sophomore at the University of Idaho.
PESCA: And when you started, did you say, wow, I like some books. I don't like others, here are the things that books have in common? Did you start it saying, I want a better way to recommend books for myself?
Mr. STANTON: Actually, well, kind of. Actually, I started it from the other perspective. I was a writer and I wanted to figure out if there was a way that I could objectively compare my writing to other writing out there. I wanted to know how I was doing comparing to Stephen King or Michael Crichton or whatnot, and so, I kind of looked at it from the other perspective at first. You know, I wanted to know what the average first scene was.
PESCA: Yeah, that makes sense, as you try to improve your craft, you're like, hey, look at that. I think, with screenwriting, there are a lot of programs and courses out there where they're pretty scientific and methodical about that, right? They say, your action has to come in 20 minutes into the first scene and there are three acts. But with novels, people don't give you that kind of very terse advice, I think.
Mr. STANTON: Right, right. Not only that, but kind of our concept is not so much that there is, you know, there is a set formula for everybody kind of thing. The whole point, of course, is that you will have a different formula you prefer than me and so forth, and our idea is to actually try to find what each individual person's personal formula is. We don't try to tell people what good books are. We just try to tell you what you think a good book is for you.
PESCA: So, what are the things that you - I went to the site and I have to say, right now, I wouldn't recommend it to the masses, because there are only, it looks like, a couple hundred, or a couple dozen books in there right now.
Mr. STANTON: Right. Actually, there's actually about 300 books in right now.
PESCA: A lot of sci-fi, for some reason.
Mr. STANTON: Say again?
PESCA: A lot of sci-fi?
Mr. STANTON: Right. Well, that's kind of just a - that was kind of a personal choice, because a lot of people we recruited as beta testers tended to be sci-fi users - readers, and I personally like sci-fi. But no, definitely the system is really designed to have 10,000 to 100,000 to a million books in it before it's really useful for recommendations. What's there now is what we call technology demonstration. It shows off what we're doing.
PESCA: Right. And I did get an idea of what you're doing, even if the one book I found in there that I had read, "1984," it told me that I would enjoy - do you know what happens when you say you like "1984"? Do you know what it spits back as your number one choice?
Mr. STANTON: Oh, yeah, actually that's an Easter egg we programmed there specifically as a joke.
PESCA: OK, so it told me I would like the U.S.A. Patriot Act. OK, don't like the U.S.A. Patriot Act. I saw how it ends from a mile away. But what are you scanning for? Vocabulary words? Plot? The tension in the book? What else?
Mr. STANTON: Well, actually, what we ultimately say is that we analyze books for writing style. So, you know, is it dense language, light language? What is the description level? What is the reaction level, the dialogue level, the pacing level? Things like this.
PESCA: When you say like, dialogue level, does that mean amount of dialogue?
Mr. STANTON: Well, actually, when we specifically say dialogue, we refer to the amount of back and forth exchange that occurs between multiple characters.
Mr. STANTON: So, you can also - I mean, technically, having a single page that's a monologue would be a lot of dialogue, except for that's not exactly what we're interested in. We want to know exactly when two characters in a room, is it one sentence with a whole bunch of prose describing the room or is a whole lot of back and forth?
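(Editor's illustration: the distinction Stanton draws here, back-and-forth exchange rather than sheer volume of quoted speech, can be sketched in a few lines of Python. This is not Booklamp's method, which the interview does not describe. The function names, the quote-matching rule, and the scoring formula are all assumptions made for the example.)

```python
import re

# Assumption: speech is marked with straight double quotes.
QUOTE = re.compile(r'"[^"]+"')


def has_speech(paragraph: str) -> bool:
    """Return True if the paragraph contains at least one quoted span."""
    return bool(QUOTE.search(paragraph))


def exchange_level(paragraphs: list[str]) -> float:
    """Fraction of adjacent paragraph pairs in which both contain speech.

    A hypothetical proxy for "back and forth": a one-paragraph monologue
    scores 0.0, however long it is, because nobody answers it.
    """
    if len(paragraphs) < 2:
        return 0.0
    flags = [has_speech(p) for p in paragraphs]
    exchanges = sum(1 for a, b in zip(flags, flags[1:]) if a and b)
    return exchanges / (len(paragraphs) - 1)


# Hypothetical scenes, written for this example only.
monologue = ['"I have a great deal to say, and I will say all of it now."']
rapid_fire = ['"Ready?" she asked.', '"No," he said.', '"Too bad."']
prose_heavy = [
    "The room was dim.",
    '"Hello," he said.',
    '"Hi."',
    "Dust covered the shelves.",
    "Nothing moved.",
]
```

Under these assumptions the monologue scores 0.0, the rapid exchange scores 1.0, and the prose-heavy scene, with one exchange among four adjacent pairs, scores 0.25.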
PESCA: Right. You know, Elmore Leonard has a lot of dialogue and back and forth. Well, why is this better than what people do without an algorithm? Which is something like, hey, you like those first two Tom Clancy books? You'll probably like the third.
Mr. STANTON: Right. Well, actually, I mean, to say it's better than finding - if you know you like one author and you want to read other books like his, rather than the same books by him, that's not necessarily any better or worse. I mean, basically, what we're trying to do is find a system that has sequel-quality recommendations. If you like this book, you'll like other books by the same author, but also you can branch out to other authors and find equally good books.
You're not tied to any specific - I've read every Stephen King book out there. I'd like to find other authors that are as good. That's ultimately what we're trying to do. What we're trying to do, actually is, let's say, there's three elements to a book that makes a book good to you, and that's storyline, plot - or plot, characters and writing style. I'd say I'd be hard-pressed to say one of those three is better than the other, but writing style is certainly important and currently not really being rated by any others, so we try to give you a recommendation of all three.
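(Editor's illustration: one common way to turn style measurements into recommendations is to treat each book as a point in feature space and rank the others by distance from the book the reader liked. The sketch below assumes this nearest-neighbor approach. The interview does not say what Booklamp actually uses. The feature order, the 0-to-1 scale, the titles, and the numbers are placeholders, not real data.)

```python
import math

# Hypothetical style vectors: (description, dialogue, pacing, density),
# each on an assumed 0-to-1 scale. Titles and values are placeholders.
catalog = {
    "Book A": (0.8, 0.2, 0.7, 0.9),
    "Book B": (0.7, 0.3, 0.6, 0.8),
    "Book C": (0.1, 0.9, 0.2, 0.3),
}


def distance(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Euclidean distance between two style vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend(
    liked: str, books: dict[str, tuple[float, ...]], k: int = 3
) -> list[str]:
    """Return up to k titles closest in style to the liked book."""
    target = books[liked]
    ranked = sorted(
        (title for title in books if title != liked),
        key=lambda title: distance(target, books[title]),
    )
    return ranked[:k]


suggestions = recommend("Book A", catalog, k=2)
```

With these placeholder values, Book B is closer in style to Book A than Book C is, so it ranks first. Nothing in the ranking looks at the author's name, which is what lets a reader branch out to other authors.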
PESCA: Does this work better for books that are meant to be, you know, beach reads or, you know, I guess what they call mass paperbacks, as opposed to literature?
Mr. STANTON: Well, I don't think so. Now, obviously, lots of people have different opinions on things, but ultimately, you know, if you're reading "Moby Dick," it has a - all language, or all the books and storylines that are out there have to be filtered at some point through the language that they're written in. And whether that's a serious book with very complex storylines, you still have to like the way it's told, otherwise you're not going to get very far into the storyline or the characters. The same thing applies, too, if it's, like you said, a beach read or a light, summer read.
PESCA: Is there anything that you found that's strangely correlative, like, you know, Patricia Kingsolver and Hemingway seem to have nothing in common, but they both use commas the same way and it turns out that people who like one like the other? Is there anything really weird like that that jumped out?
Mr. STANTON: Well, at this point, it's a little early to say, because, again, with 300 books in the database kind of as a test to the technology, those kind of strange correlations aren't going to show up quite yet. Once we get to the 10,000-book mark, that's the point when we really expect to see odd things, and then there will be all sorts of - well, we expect all sorts of bizarre correlations. If you want to try looking for them, and we will, might show up. So, who knows? We'll see.
PESCA: What do you need from people? Do you need people to, you know, scan in books or kind of code books or just to go to your site? What do you need the masses to do if they want - if they want Booklamp to really take off?
Mr. STANTON: Right. OK. So, we've actually had a very, very positive public response. We've - originally when the project started and announced, we've gotten hundreds and thousands of emails over the course of the project from people just - librarians and so forth, saying this is really cool, if you need any help. And right now what we say is, hey, simply watch the site. We're not asking people to scan books for us.
Our approach is to talk to publishers directly. There are copyright issues otherwise, and there are also, well, just the amount of work that it would take to do that for usual people is very great. So, what we say is, go give us feedback. Let us know if we're taking the company in the right direction. Are you interested in this? Is this something that you find should be developed or shouldn't be developed? Let us know. I mean, ultimately, we are very, very public about what we do. We try to keep people informed from the ground up on the progress of the project, and feedback is immensely valuable to us.
PESCA: Yeah. Well, I want to thank you very much.
Mr. STANTON: I appreciate it.
PESCA: Booklamp.com's founder and evangelist is...
Mr. STANTON: Dot org.
PESCA: Dot org, oh, sorry. Booklamp.org's evangelist is Aaron Stanton. Thanks a lot, Aaron.
Mr. STANTON: All right. Thank you. Have a good day.
NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.