API Update : New Transcript API and Much More : Inside NPR.org With the new site launch, NPR has also made some significant changes to the API. Most notably, we have added our new Transcript API. Find out what other changes we made and tell us what else you would like to see in our API.

API Update : New Transcript API and Much More

As mentioned, our site redesign brought more than visual changes. One part of this launch that we are very excited about is a set of significant updates to the API, as follows:

Major Additions

Transcript API
We are excited to introduce our new Transcript API. This API offers all of our transcripts dating back to May 2005. As of today, we are opening up over 80,000 transcripts through this API, and this number will grow with every new radio story we produce. This API contains the same transcripts that are now offered on the new NPR.org.

DISCLAIMER: The transcripts that we do have are only for our main programs, including All Things Considered, Fresh Air, Morning Edition, Talk of the Nation, Tell Me More, Weekend Edition Saturday and Weekend Edition Sunday. The transcripts take a while to produce, so they are typically not available until several hours after the program is over. Finally, while we believe the transcripts are largely accurate, there are some cases where they may not align perfectly with the audio or may contain grammatical or spelling errors.

Added Even More MP3 Files
Today, our MP3 repository goes back to 2005, and all of it is available through the Story API (to the extent that we have rights to distribute the MP3s). Later this week, we will back-fill our repository with MP3 files dating back to 2001 for Morning Edition and All Things Considered. The rest of the programs will be extended to go back to 2003. With this offering, we are providing about 200,000 unique MP3 files through the API, totalling more than 15,000 hours of MP3 audio. These totals will grow as we add MP3 files for new stories and as we continue to back-fill our MP3 repository over time.

Improved Query-Ability : Query Filtering
With this release, we have added a new parameter for your Story API queries, called "requiredAssets". This parameter accepts "text", "images", "audio" or any comma-separated combination of them. This list will also expand over time. By using requiredAssets, your query tells the API to return only those stories that have every specified asset. So, for example, if your query has requiredAssets=audio,images, only stories that have at least one audio file AND at least one image will be returned. requiredAssets is not currently an option in the Query Generator but will be added soon. In the meantime, this parameter has to be added to your query manually.
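As a rough illustration of adding the parameter by hand, here is a minimal Python sketch that builds a query URL. Only "requiredAssets" and its three values come from this post; the base URL, the "id" and "apiKey" parameter names, the ID 1001 and the key value are placeholders assumed for the example, so check them against the API documentation before use.

```python
from urllib.parse import urlencode

# Asset types named in this post; the list is expected to grow over time.
KNOWN_ASSETS = ("text", "images", "audio")


def build_story_query(item_id, required_assets, api_key,
                      base_url="http://api.npr.org/query"):
    """Return a Story API query URL that filters on requiredAssets.

    base_url, "id" and "apiKey" are assumptions made for illustration;
    only "requiredAssets" and its values are taken from the post.
    """
    unknown = [a for a in required_assets if a not in KNOWN_ASSETS]
    if unknown:
        raise ValueError(f"unsupported asset type(s): {unknown}")
    params = {
        "id": item_id,
        "requiredAssets": ",".join(required_assets),
        "apiKey": api_key,
    }
    # safe="," keeps the comma-separated asset list unescaped in the URL.
    return base_url + "?" + urlencode(params, safe=",")


# Hypothetical ID and key, used only to show the shape of the URL.
url = build_story_query(1001, ["audio", "images"], "YOUR_API_KEY")
print(url)
```

Validating the asset names on the client side is a design choice of this sketch, not something the API requires; it simply catches typos before a request is sent.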

Other Enhancements or Changes

Topic Changes
For the new site, we modified our topic structure to better reflect the kinds of stories that we produce and the way our users will navigate the site. Some of our old topics were retired, others were renamed, and a few were split and moved. In all of these cases, as strong believers in maintaining backwards compatibility, we set up server-side redirects that point each retired topic to a new topic that closely (or exactly) relates to it. No old topic ID should fail, either in returning results or in having those results be sensible for that topic. The Topic ID list (XML) in our API reflects all of our current topics. Retired topics will continue to be valid and available, although new stories will not be added to them.

You can see all of the changes to our topics here (PDF).

Thumbnail Images
Prior to this release, thumbnail images were only displayed in their own XML element, called <thumbnail>. That element still exists (for backward compatibility). Because of the new format for the website, we now offer <medium> and <large> thumbnails as sub-elements, the former being 75 pixels square and the latter 90 pixels square. In addition to the changes to <thumbnail>, the thumbnails have also been included in the standard <image> output elements.
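To make the new sub-elements concrete, here is a small Python sketch that reads them from a story. The sample XML is invented for illustration: the post only says that medium and large are sub-elements of the thumbnail element, so the assumption that each one carries its image URL as text, as well as the example.org URLs, should be verified against real API output.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real NPRML layout may differ (for example, the URL
# could live in an attribute rather than in the element text).
SAMPLE_STORY = """
<story id="0">
  <thumbnail>
    <medium>http://example.org/thumb-75.jpg</medium>
    <large>http://example.org/thumb-90.jpg</large>
  </thumbnail>
</story>
"""


def thumbnail_urls(story_xml):
    """Return a dict mapping 'medium'/'large' to the URLs found, if any."""
    root = ET.fromstring(story_xml.strip())
    thumb = root.find("thumbnail")
    if thumb is None:
        return {}
    urls = {}
    for size in ("medium", "large"):
        node = thumb.find(size)
        if node is not None and node.text:
            urls[size] = node.text.strip()
    return urls


thumbs = thumbnail_urls(SAMPLE_STORY)
print(thumbs)
```

Returning an empty dict when a story has no thumbnail lets callers fall back to another image without special-casing missing elements.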

Story List Element
For NPRML queries, we provide a <list> parent element as a container for the stories that get returned. This container, in addition to the title and teaser information that it has always provided, now offers links back to the original API call as well as to an HTML page (if a related one exists). These kinds of outputs help the API be more RESTful.
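A client could pick up these links along the following lines. This is only a sketch: the post does not show the markup, so the link element name, its "type" attribute with values "api" and "html", and the sample document itself are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample; element and attribute names for the links are assumed.
SAMPLE_NPRML = """
<nprml>
  <list>
    <title>Sample list title</title>
    <teaser>Sample teaser text.</teaser>
    <link type="api">http://example.org/original-api-call</link>
    <link type="html">http://example.org/related-page.html</link>
    <story id="0"/>
  </list>
</nprml>
"""


def list_links(nprml):
    """Return {link type: URL} for the links directly under the list element."""
    root = ET.fromstring(nprml.strip())
    container = root.find("list")
    if container is None:
        return {}
    links = {}
    for link in container.findall("link"):
        kind = link.get("type")
        if kind and link.text:
            links[kind] = link.text.strip()
    return links


links = list_links(SAMPLE_NPRML)
# The HTML link is optional, so use .get() rather than indexing.
print(links.get("api"), links.get("html"))
```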

Version Upgrade
All of these changes culminate in a version upgrade. The new version is .93.

In addition to the new features mentioned above, I also think it is interesting to point out that the new NPR site is heavily dependent on the API. The dependencies are both in infrastructure and in enabling more efficient and expedient ways to create content. These topics will be covered in other blog posts in this series. In our next post, however, we will discuss how we got started on the technical implementation: new tools, new processes.

We are very excited about these additions to the API and would like your feedback. Do these changes inspire new mashup ideas for you? What else would you like to see offered in our APIs?

(By the way, for more conversations about NPR's API and other technical advancements, follow us on Twitter at @NPRTechTeam.)