A team from Swedish Radio (SR) takes on the challenge of better serving underserved audiences through AI with the Pick-and-mix podcast segmenter
Podcasting is a booming industry where numerous new players produce and distribute podcasts. Public service radio companies have an opportunity to better serve younger and more diverse audiences through podcasts. In linear radio, one size must fit all, but podcasting makes personalised experiences possible. Swedish Radio holds a strong position in the Swedish podcast market, with some of the most loved titles and a massive catalogue of digital audio in many different fields.
This catalogue, however, is of limited value if the audience cannot find what they are looking for. Even our employees do not really know what is in our archive. In addition, audio is difficult to search. Google needs text, and although editors and journalists sometimes supply descriptive texts along with the audio, most often they do not. Moreover, one podcast episode often covers several different topics. So even if you are lucky enough to find the right episode, you might still not find the segment you were looking for, and instead have to scroll back and forth to find the part where “your” topic is (pictures 2 and 3).
Since spring this year, a cross-functional team of four from SR (two software developers, one analyst and one journalist) has worked on the 2021 JournalismAI Collab Challenges, looking for ways to better serve audiences that our content underserves today. The challenge for the Europe and Middle East part of the project was: “How might we use modular journalism and AI to assemble new storytelling formats and reach currently underserved audiences?” JournalismAI is a global initiative that aims to inform media organisations about the potential offered by AI-powered technologies. It is a project of Polis, the journalism think-tank at the London School of Economics and Political Science, and this challenge was run with the support of BBC News Labs and Clwstwr.
Through the Pick-and-mix podcast segmenter, we help the audience find what they are looking for and point them to the exact place in the podcast where the interesting discussion starts. Instead of getting a one- to two-hour chunk of audio, you get a neat snippet, so you do not have to scroll back and forth to find the right spot. Instead of hitting a wall of audio, you can navigate smoothly between well-defined segments that are easy to find. We put the power in the hands of the listeners to choose what they need from us at that very moment.
Through the Pick-and-mix podcast segmenter, our editors can find material that was previously lost in the archives or hidden in large blocks of audio. For the end user this means access to deeper stories and historical context to current affairs.
We do this through a process of segmenting the audio, transcribing it, extracting entities and matching each segment against other segments to find similar ones. The end product in this case is a user interface for editors to build playlists out of those matches. There is also a simple application that uses these segments to build timelines.
The team from Swedish Radio (SR) worked with Studio Ett, a popular current affairs podcast that is also a radio programme. The newsroom produces one episode every weekday. Each episode runs for approximately 1 hour and 40 minutes and covers about 8 to 10 different topics, typically separated by a short signature.
We have a proof of concept that uses the signature to divide the longer audio into smaller segments automatically, but for the demo we segmented the audio manually.
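One way such signature-based splitting could work is sketched below. This is an illustration under stated assumptions, not the team's proof of concept: it assumes the signature is available as a short reference clip, slides it over the episode with a normalised cross-correlation, and cuts wherever the score passes a threshold. The function names, the 0.9 threshold and the tiny synthetic "audio" are all hypothetical.

```python
import math

def find_signature(audio, signature, threshold=0.9):
    """Return sample offsets where `signature` occurs in `audio`.

    Uses a normalised (Pearson) cross-correlation per window.
    The threshold is an assumed cut-off, not a tuned value.
    """
    n = len(signature)
    sig_mean = sum(signature) / n
    sig = [s - sig_mean for s in signature]
    sig_norm = math.sqrt(sum(s * s for s in sig))
    hits = []
    i = 0
    while i + n <= len(audio):
        window = audio[i:i + n]
        w_mean = sum(window) / n
        win = [w - w_mean for w in window]
        w_norm = math.sqrt(sum(w * w for w in win))
        score = 0.0
        if sig_norm > 0 and w_norm > 0:
            score = sum(a * b for a, b in zip(sig, win)) / (sig_norm * w_norm)
        if score >= threshold:
            hits.append(i)
            i += n  # jump past the signature that was just matched
        else:
            i += 1
    return hits

def split_at_signatures(total_len, hits, sig_len):
    """Turn signature offsets into (start, end) topic segments, in samples."""
    segments = []
    start = 0
    for hit in hits:
        if hit > start:
            segments.append((start, hit))
        start = hit + sig_len
    if start < total_len:
        segments.append((start, total_len))
    return segments

# Synthetic stand-ins: a 7-sample "signature" and three quiet "topics".
SIGNATURE = [1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]

def quiet_topic(n):
    return [0.01 * j for j in range(n)]

audio = quiet_topic(10) + SIGNATURE + quiet_topic(12) + SIGNATURE + quiet_topic(8)
hits = find_signature(audio, SIGNATURE)
segments = split_at_signatures(len(audio), hits, len(SIGNATURE))
print(hits)
print(segments)
```

A pure-Python loop like this is far too slow for a 100-minute episode. On real audio one would likely work on downsampled audio or audio features and use an FFT-based correlation, but the splitting logic would stay the same.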
With each topic in its own segment, we ran the segments through a speech-to-text tool to get transcriptions.
The SR team then ran the transcriptions through Radboud Entity Linker and DBpedia Spotlight to extract entities and find topics and search terms.
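Once each segment has a list of extracted entities, making them searchable can be as simple as an inverted index from entity to segments. The sketch below assumes the entity linkers have already run. The segment ids, entity lists and function names are made up for illustration and do not reflect the output format of either tool.

```python
from collections import defaultdict

def build_entity_index(segment_entities):
    """Map each lower-cased entity name to the set of segments mentioning it."""
    index = defaultdict(set)
    for segment_id, entities in segment_entities.items():
        for entity in entities:
            index[entity.lower()].add(segment_id)
    return index

def search(index, *terms):
    """Return the segments that mention every one of the given terms."""
    matches = [index.get(term.lower(), set()) for term in terms]
    if not matches:
        return []
    return sorted(set.intersection(*matches))

# Hypothetical entities per segment, as an entity linker might return them.
segment_entities = {
    "seg-001": ["Covid-19", "Sweden"],
    "seg-002": ["Riksdag", "Sweden"],
    "seg-003": ["Covid-19", "Vaccine"],
}

index = build_entity_index(segment_entities)
print(search(index, "covid-19"))
print(search(index, "Covid-19", "Sweden"))
```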
We also applied Contrastive Tension, a model for finding semantic similarities between texts.
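To illustrate how semantic matching of this kind is commonly used, the sketch below assumes the model turns each segment transcript into an embedding vector and ranks other segments by cosine similarity. The three-dimensional vectors and segment names are invented. Real sentence embeddings have hundreds of dimensions, and the source does not say which similarity measure the team used.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query_id, embeddings, top_k=3):
    """Rank all other segments by similarity to the query segment."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical embeddings, one per segment transcript.
embeddings = {
    "vaccine-rollout": [0.9, 0.1, 0.0],
    "covid-restrictions": [0.8, 0.2, 0.1],
    "budget-debate": [0.1, 0.9, 0.2],
    "football": [0.0, 0.1, 0.9],
}

ranking = most_similar("vaccine-rollout", embeddings)
for segment_id, score in ranking:
    print(f"{segment_id}: {score:.2f}")
```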
The end product in this case is a user interface that lets an editor search for terms and topics and get hits on segments. The editor can then choose one segment and find similar ones. The user interface shows how good each match is through color codes.
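A color code like this can be a plain mapping from similarity score to a small set of buckets. The sketch below is only a guess at the idea: the colors, the thresholds (0.8 and 0.5) and the function name are assumptions, and the source does not describe how the interface actually grades matches.

```python
def match_color(score, strong=0.8, fair=0.5):
    """Map a similarity score in [0, 1] to a traffic-light color.

    The thresholds are illustrative defaults, not values from the project.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= strong:
        return "green"
    if score >= fair:
        return "yellow"
    return "red"

for score in (0.93, 0.61, 0.2):
    print(score, match_color(score))
```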
Finally, the editor can export the segments to a playlist. The playlist can be viewed on a timeline, with each segment plotted as a point according to when it was aired. This could be very helpful in giving the user historical context for events that have been going on for a long time. One such area is politics and the upcoming Swedish election, where it might be helpful to see what has been said and what has happened to different talking points during the last few years.
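Placing a playlist on a timeline comes down to ordering the segments by air date. A minimal sketch, assuming each exported segment carries its air date and its start and end offset within the episode. The `Segment` fields and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Segment:
    title: str
    aired: date   # broadcast date of the episode
    start_s: int  # offset of the segment within the episode, in seconds
    end_s: int

def build_timeline(playlist):
    """Order segments by air date, then by position within the episode."""
    return sorted(playlist, key=lambda seg: (seg.aired, seg.start_s))

# Invented playlist entries, in the order an editor happened to pick them.
playlist = [
    Segment("Vaccination begins", date(2020, 12, 28), 1200, 1800),
    Segment("First restrictions", date(2020, 3, 12), 300, 900),
    Segment("Restrictions lifted", date(2021, 9, 29), 0, 540),
]

timeline = build_timeline(playlist)
for seg in timeline:
    print(seg.aired.isoformat(), seg.title)
```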
Through this application, our editors can search for, and find, material that is hidden in our vast archive (picture 4). Moreover, the segments are separated out, so editors do not need to listen through hours of audio to find what they are looking for. The next step would be to make this application available to the audience (pictures 5 and 6).
Journalists could also follow a news story over time. For example, they could see what was reported about covid throughout the pandemic and follow on the timeline how the story developed. To take this tool even further and unlock its true potential, we envision serving a version of it to our end users, allowing them to accurately search our vast catalogue for interesting segments and build their own playlists and timelines of events (picture 7).
The Pick and mix podcast segmenter is presented at the Journalism AI festival on 1 December 2021: https://www.journalismaifestival.com/.
Head of Playdesk
Contact the team
Tobias Björnsson firstname.lastname@example.org
Kajsa Norell email@example.com
Erik Sillén firstname.lastname@example.org
Emil Tavassoli email@example.com
This project is part of the 2021 JournalismAI Collab Challenges, a global initiative that brings together media organisations to explore innovative solutions to improve journalism via the use of AI technologies.
It was developed as part of the EMEA cohort of the Collab Challenges that focused on “How might we use modular journalism and AI to assemble new storytelling formats and reach currently underserved audiences?” with the support of BBC News Labs and Clwstwr.
JournalismAI is a project of Polis – the journalism think-tank at the London School of Economics and Political Science – and it’s sponsored by the Google News Initiative. If you want to know more about the Collab Challenges and other JournalismAI activities, sign up for the newsletter or get in touch with the team via firstname.lastname@example.org