Summarizing YouTube Videos with LLMs

January 22, 2025
Tags: ai, video, web

Over the holiday break I was thinking about taking transcriptions of YouTube videos and running them through an LLM for summaries. My inspiration was the dust-up around the Honey coupon code browser extension. Most of the news pointed to MegaLag’s video Exposing the Honey Influencer Scam. It’s a 23-minute video, which isn’t too bad, but I wondered whether a summary would help here.

Python Solution

A flowchart showing a process from YouTube Channel to RSS Feed, Audio Download, Whisper Transcript, LLM Summarize, and finally Email, with arrows indicating the progression between each step.

I built a simple pipeline in Python that takes a list of YouTube channels and produces a summary email for each published video.

YouTube channels have a standard RSS feed, which makes this first step very simple. From the video URL I’m using yt-dlp to download the “best” audio stream, which comes down in WebM format. Each video has a range of different audio qualities you can download, but this seemed to work fine. I didn’t go back to see if “worst” audio produced worse transcriptions, but the download is so small it doesn’t really matter. What does matter, however, is that downloading from YouTube is painful, and there’s a constant battle with the download tools out there. Since this is just for personal use, I’m using my own cookies and that seems to be pretty reliable. It would be tricky to build a real service like this unless there was a real YouTube API for getting this data.
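A minimal sketch of these first two steps might look like the following. It is not the exact code from the project: the function names, the output directory, and the cookie file path are hypothetical, and "bestaudio/best" is one reasonable yt-dlp format selector for "best" audio. The feed parsing uses only the standard library, and yt-dlp is imported lazily so the parsing can be tried without it installed.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# XML namespaces used in YouTube's channel feeds (Atom format).
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "yt": "http://www.youtube.com/xml/schemas/2015",
}


def feed_url(channel_id: str) -> str:
    """Return the feed URL for a channel ID."""
    return f"https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}"


def parse_feed(xml_text: str) -> list:
    """Extract the video ID, title, and watch URL from each feed entry."""
    root = ET.fromstring(xml_text)
    videos = []
    for entry in root.findall("atom:entry", NS):
        video_id = entry.findtext("yt:videoId", namespaces=NS)
        videos.append({
            "id": video_id,
            "title": entry.findtext("atom:title", namespaces=NS),
            "url": f"https://www.youtube.com/watch?v={video_id}",
        })
    return videos


def download_audio(url: str, out_dir: str = "audio",
                   cookies: Optional[str] = None) -> None:
    """Download the best available audio stream with yt-dlp."""
    from yt_dlp import YoutubeDL  # third-party: pip install yt-dlp

    options = {
        "format": "bestaudio/best",              # prefer audio-only streams
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",  # e.g. audio/<video id>.webm
    }
    if cookies:
        # Hypothetical path to a cookies file exported from your browser.
        options["cookiefile"] = cookies
    with YoutubeDL(options) as ydl:
        ydl.download([url])
```

Fetching the feed itself is left out; any HTTP client (for example urllib.request.urlopen) can retrieve the text that parse_feed expects.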

For transcription I’m using OpenAI Whisper running locally. It has 6 different model sizes with tradeoffs in speed and accuracy. Right now I’m using “turbo”, which is pretty high on memory usage but gives good quality results. This is definitely the slow part of the pipeline; the time to transcribe goes hand in hand with the video length.
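As a rough illustration, local transcription with the openai-whisper package can be as short as the sketch below. The function name and the optional model argument are assumptions made for this example; passing in an already loaded model avoids reloading it for every video, which matters because loading "turbo" is slow and memory-hungry.

```python
def transcribe(audio_path: str, model=None, model_name: str = "turbo") -> str:
    """Transcribe an audio file and return the plain transcript text."""
    if model is None:
        # third-party: pip install openai-whisper (also needs ffmpeg installed)
        import whisper
        model = whisper.load_model(model_name)
    # transcribe() returns a dict with the full "text" plus timed "segments".
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

When processing several videos in one run, load the model once with whisper.load_model("turbo") and pass it as model on each call.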

I started to run this project on a DigitalOcean droplet, but my usual cheap choice (1 GB memory, 1 GB disk) definitely did not have enough memory to run Whisper. I did some quick tests with the OpenAI Speech to text API, which worked great, so I might switch to that. In the meantime, this project runs on my MacBook Air.
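For comparison, a hosted transcription call with the official openai Python package could look roughly like this. It is a sketch under assumptions: the function name is hypothetical, "whisper-1" is assumed as the model, and the client is injectable so the function can be exercised without network access. The hosted API also caps upload size, so long videos may need to be split or re-encoded first.

```python
def transcribe_remote(audio_path: str, client=None, model: str = "whisper-1") -> str:
    """Send an audio file to the OpenAI speech-to-text API and return the text."""
    if client is None:
        from openai import OpenAI  # third-party: pip install openai
        client = OpenAI()  # reads the OPENAI_API_KEY environment variable
    with open(audio_path, "rb") as audio_file:
        response = client.audio.transcriptions.create(model=model, file=audio_file)
    return response.text
```

Because the heavy lifting happens on OpenAI's side, this version needs very little memory, which is what would make a small droplet viable.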

To summarize the generated transcript text, I used my favorite LLM command line tool from the Datasette project and the GPT-4o mini model. These are the prompts I used:

“summarize this transcript of a youtube video” → this one is very simple but you could also make it a bit richer by asking for pull quotes for example
“reformat this text to add punctuation and paragraph breaks where it makes sense; try to keep as much of the original text, while making it easier to read” → this helped make the full transcription text much easier to read while not changing the content too much
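One way to drive the LLM command line tool from Python is to pipe the transcript to it on standard input, as sketched below. Passing the prompt as a system prompt with -s is an assumption about how the prompts were applied, and the helper names are hypothetical; the tool needs to be installed and configured with an OpenAI API key for run_llm to work.

```python
import subprocess

SUMMARIZE_PROMPT = "summarize this transcript of a youtube video"
REFORMAT_PROMPT = (
    "reformat this text to add punctuation and paragraph breaks where it "
    "makes sense; try to keep as much of the original text, while making "
    "it easier to read"
)


def llm_command(system_prompt: str, model: str = "gpt-4o-mini") -> list:
    """Build the argument list for the llm CLI."""
    return ["llm", "-m", model, "-s", system_prompt]


def run_llm(transcript: str, system_prompt: str, model: str = "gpt-4o-mini") -> str:
    """Pipe the transcript to llm on stdin and return its response."""
    completed = subprocess.run(
        llm_command(system_prompt, model),
        input=transcript,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if llm exits with an error
    )
    return completed.stdout.strip()
```

Calling run_llm(transcript, SUMMARIZE_PROMPT) and run_llm(transcript, REFORMAT_PROMPT) would then produce the two outputs for the email.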

This was also the first time I used GitHub Copilot chat and autocomplete for a project from the start. The autocomplete in particular was surprisingly good with its suggestions; I suspect because this type of Python code is pretty common and I wasn’t exactly breaking any new ground here, but I had multiple “that’s exactly what I was going to type!” moments. I had mixed results with chat; for example I asked for a simple example of sending email with Python and it gave me a very convoluted example that didn’t actually work. Luckily the Python email module examples were all I needed.
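For reference, sending a plain-text summary with the standard library takes only a few lines. This is a sketch in the spirit of the email module examples, not the project's actual code: the function names, the subject format, and the SMTP host, port, and credentials are all placeholders.

```python
import smtplib
from email.message import EmailMessage


def build_summary_email(sender: str, recipient: str, video_title: str,
                        video_url: str, summary: str) -> EmailMessage:
    """Build a plain-text email containing the video link and its summary."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Video summary: {video_title}"
    message.set_content(f"{video_url}\n\n{summary}\n")
    return message


def send_email(message: EmailMessage, host: str, port: int = 465,
               username: str = "", password: str = "") -> None:
    """Send the message over an SSL-wrapped SMTP connection."""
    with smtplib.SMTP_SSL(host, port) as server:
        if username and password:
            server.login(username, password)
        server.send_message(message)
```

Keeping message construction separate from sending makes it easy to inspect an email before anything goes out.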

Example

Going back to the Honey Influencer Scam video, here’s a sample of what the summaries look like.

First, the simple “summarize this video” result:

The video outlines an extensive investigation into Honey, a free browser extension owned by PayPal, that promises to automatically apply coupon codes at checkout to save consumers money. The narrator alleges that Honey not only exploits consumers but also engages in deceptive practices that defraud influencers and content creators out of their rightful affiliate commissions.

Key Points from the Investigation:

Commission Theft: When consumers use Honey after clicking on an influencer’s affiliate link, Honey replaces the affiliate cookie with its own, claiming the commission for itself, even if it doesn’t find a working coupon. This practice undermines influencers who promote products, as Honey effectively steals their commissions.

Honey Gold (Cashback) Mechanism: Honey introduced a cashback feature (now called PayPal Rewards) that encourages users to click through Honey at checkout to earn rewards. This system further benefits Honey by allowing them to claim commissions instead of paying it to the referring influencer.

Deceptive Practices: Honey often only shows coupons that are approved by partnering stores, which allows those stores to control which discounts are promoted. This contradicts Honey’s claim of finding the best deals available, leading to no transparency for consumers who believe they are getting the best price.

Influencer Response: Over time, some influencers, notably Linus Tech Tips, recognized that Honey was undermining their affiliate links and eventually ended their partnerships with Honey. However, the narrator expresses disappointment that these influencers didn’t publicly address these issues sooner given their significant audience reach.

Misleading Marketing Claims: The video critiques Honey’s marketing strategy, highlighting exaggerated claims about its capabilities to find every coupon code on the internet. The Better Business Bureau even investigated Honey’s advertising but ceased after Honey discontinued claim practices for “business reasons.”

Consumer Impact: The investigation suggests that the practices of Honey could cost influencers, bloggers, and other small content creators millions in lost commissions while misleading consumers into thinking they are getting the best deals. The narrator emphasizes that this deceptive model would not be tolerated in a traditional retail environment.

The video presents Honey as a largely predatory service that financially benefits at the expense of both influencers and consumers by exploiting the affiliate marketing system and manipulating coupon code promotions.

Second, here’s a snippet of the reformatted original transcript which I was pretty impressed with:

But now, you’ve probably heard about Honey, you know, the browser extension that saves you money. So, you have Honey installed? What’s Honey? Oh, no, no.

Honey? Honey is a free browser extension. Free browser extension. Free browser extension. But automatically applies coupon codes when you check out online. Free money, basically. Literally free money. It doesn’t make sense not to be using this.

So what’s the catch?

There is… okay, join Honey. Join Honey. Honey. Honey. Honey. Honey. Honey.

Yeah, I hate to break it to you, but your favorite influencers sold you a lie. Honey is a scam, and the majority of claims promoted by those influencers aren’t even remotely true. But it gets worse. Honey hasn’t just been scamming you, the consumer; they’ve also been stealing money from influencers, including the very ones they pay to promote their product. And I’m not just talking about a few bucks here. I believe the scam has likely cost content creators millions of dollars.

Sound crazy? Well, I didn’t believe it at first either. Until I experienced it myself, firsthand. In fact, I’m confident this might just be the biggest influencer scam of all time, which is insane considering Honey is owned and run by PayPal, who purchased this company for $4 billion. This three-part series is the result of a multi-year investigation where I believe I’ve uncovered signs of advertising fraud, affiliate fraud, the illegal collection of personal data, deception, lies, coercion, extortion… the list goes on. I’ve reviewed hundreds of documents, advertisements, sponsorships; I’ve reviewed emails between Honey and merchants, interviewed victims—believe me, this runs deep.

Now, I want to be clear: the views, allegations, and conclusions expressed in the series are my opinions, based on evidence I have gathered, which will be shared throughout. With that said, ladies and gentlemen, this is the Honey Trip.

Future Ideas and Conclusion

Future ideas to explore:

Include a time-stamped transcript in the result (Whisper generates a file you could use for this)
Explore Whisper models to see if a faster model can be just as accurate
Switch from local to cloud Whisper model and try running everything from a small DigitalOcean droplet
Play around with the LLM prompts more
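The time-stamped transcript idea could be sketched as follows. Instead of reading the file Whisper writes, this assumes the Python API is used, where the transcription result includes a "segments" list whose items carry "start", "end", and "text" fields. The helper names and the bracketed output format are illustrative choices.

```python
def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS, dropping fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def timestamped_transcript(segments: list) -> str:
    """Render one line per segment, prefixed with its start time."""
    return "\n".join(
        f"[{format_timestamp(segment['start'])}] {segment['text'].strip()}"
        for segment in segments
    )
```

Given the result dict from a Whisper transcription, timestamped_transcript(result["segments"]) returns text that could be appended to the summary email.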

So does this meet my goal of making videos easier to understand without watching? I think so, especially for the videos which are mainly narrating as opposed to building or doing something. I’m running with a set of 6 YouTube channels I follow to see how it goes.
