Don't mean to be snarky, but this is not how easy it is to get ad revenue. It's how easy it is to get approved for ad networks. She didn't even get AdSense. I find it completely unremarkable that anyone could set up a non-adult site that has human-generated content and get ads placed. The traffic needed for real ad revenue is a different story. I bet that site gets close to zero traffic (not even enough to cover hosting). SEO (black-hat or whatever) is the trick IMO, not getting ads placed. Plagiarized new domains get no weight in the Google engine.
Honestly, it'd be pretty interesting to see an article like this which continues after the 'approved for ad networks' part and shows how such a site could rank in Google, do well on social media sites, etc.
Could be interesting to see how scammers are doing that, and lead to some potentially interesting insights about black hat SEO, social media marketing, targeted ads, etc.
Because yeah, as you said, getting approved by an ad network is only part of the story, and not very much of it at that.
It may be a low bar, but being able to automate this (a website-scraping Ansible playbook?) makes the effort required near-nil. They only have to clear $50 to pay for a LOT of domains and hosting.
Sure, but you need thousands of legitimate-seeming pageviews to get that $50 back, and the networks - even or especially the bottom tier ones - are likely to be hotter on click fraud than scraped content.
And 16,000 page views without unique content and links might happen, but most likely not within a year or five.
If you're lucky, you trigger something in Google's black box and they rank your site better than others at the same level, but you'll still only do long tail, and even on long tail, you'll compete with the original source of the article, which has a billion links pointing to its domain. Since you'll also need to go for quantity, you'll have a giant number of pages as well, which will not help you even with niche rankings.
I doubt that the site would pull in 10 actual, human visitors per day on average with just scraped content.
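For what it's worth, the numbers above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the $3.125 RPM is simply what "$50 back after 16,000 page views" implies, not a rate quoted by any ad network, and the function names are made up for illustration.

```python
import math

# Assumption: revenue per 1,000 page views implied by the thread's own
# figures ($50 over 16,000 page views). Not a rate quoted by any network.
IMPLIED_RPM_USD = 3.125


def pageviews_to_break_even(cost_usd: float, rpm_usd: float) -> int:
    """Page views needed to earn cost_usd at rpm_usd per 1,000 views."""
    return math.ceil(cost_usd * 1000 / rpm_usd)


def days_to_reach(pageviews: int, views_per_day: float) -> int:
    """Days needed to accumulate pageviews at a steady daily rate."""
    return math.ceil(pageviews / views_per_day)


views_needed = pageviews_to_break_even(50, IMPLIED_RPM_USD)
# Assumption: 10 page views per day, the upper bound guessed above
# (treating each visitor as one page view).
days_needed = days_to_reach(views_needed, 10)
years_needed = round(days_needed / 365, 1)

print(views_needed, days_needed, years_needed)
```

At 10 views a day that works out to a bit over four years just to earn back $50, which lines up with the "not within a year or five" estimate.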
It's really not easy to get the ad revenue. Scraped websites can't rank well or get enough visitors. Of course one could build a click-fraud system, but anyone capable of that can easily find more interesting things to do than chase peanuts.
It’s a dirty business, no cost is too great for eyeballs.
In Turkish, whatever you search, the first few pages of results are from the largest Turkish news outlets, because they SEO’ed for everything and Google doesn’t care.
Do you want to learn how to renew your driver's license? Good luck with that, because your search results will bring you wall-of-text articles that are almost the same for every search term.
“Lately people started to ask themselves how to renew their driver's license. But do they consider the risks of renewing driver's licenses? Experts agree that renewing the driver's license can be a complicated thing. Now strap in and get ready to learn how to renew your driver's license”
Imagine having pages like that on CNN, BBC and others. They are the top result for so many searches.
Plagiarism of news, on the other hand, is more nuanced IMHO. There’s nothing stopping you from saying “NBC reports that” anyway. As per the article, you cannot use their assets but you can create or even generate articles about the news based on the news.
The ad business is dirty. I’m almost proud of blocking ads.
Oh, definitely. Especially in these pandemic days, that was something I tried and failed at. On the Turkish web, apparently, the news outlets gave up any hope of respect and now every single one of them is doing it. The biggest ones, the leftie ones, the right-wing ones, the ones cozy with the government. All of them.
No way Google isn't aware of this. There's a local Google office in Turkey, and they have a large presence and full Turkish language support in most of their products.
Maybe it's simply part of the business model now. If a supermarket wants people to find their opening times, maybe they should buy an ad placement. There's no money in the high-quality organic search results I guess.
I've been watching this exact SEO problem in Turkey and I can say that it was in full force way before the pandemic. Google doesn't care. Instead, they're busy flagging pages discussing "penisilin (penicillin in Turkish) application for kids" as an AdSense Policy Violation since the page contains "penis".
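To illustrate the kind of false positive described here: nobody outside Google knows how the AdSense policy checks actually work, so the snippet below is purely a guess at the failure mode, with a hypothetical one-word blocklist. A plain substring test flags "penisilin", while a word-boundary match does not.

```python
import re

# Hypothetical blocklist, for illustration only.
BLOCKED_TERMS = ["penis"]


def naive_flag(text: str) -> bool:
    """Flag text if any blocked term appears anywhere, even inside a word."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def word_boundary_flag(text: str) -> bool:
    """Flag text only if a blocked term appears as a whole word."""
    lowered = text.lower()
    return any(
        re.search(rf"\b{re.escape(term)}\b", lowered) is not None
        for term in BLOCKED_TERMS
    )


page_title = "penisilin application for kids"
print(naive_flag(page_title))          # substring match fires
print(word_boundary_flag(page_title))  # whole-word match does not
```

Whole-word matching is still a crude filter, but it avoids this particular class of mistake.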
I suspect one of the hard parts of this for Google is that many news sites legitimately publish the same articles because of wire services and correspondence arrangements, like AP and Reuters. Hard to tell whether the new site is plagiarizing or syndicating.
She did say: “I didn’t want to be taking ad revenue from legitimate advertisers, so I only briefly activated advertisements from the partners to see what surfaced and to take a few screenshots.”
If you include the hit to your professional reputation from actually plagiarizing a news site for revenue and getting blacklisted from the industry, then what did it cost? Everything.
I'm not. I knew a developer who, in his spare time, developed a clever scraper. He scraped the top stories and results from Google, then scraped similar content based on Google's own ranking, then submitted that content to his own aggregator sites (all resolving to the same server). He ran ads on it. He got plenty of traffic and was netting <$500/month in 2006.
It's not that expensive to run a site and the right advertising partners (cough Taboola cough) pay nicely.
I remember in 2004 or so when an agency I worked for had a Wikipedia clone running with AdSense and tons of SEO, which made 20k per month and basically kept the company afloat. I was a young junior dev and while I was impressed by it, it never felt right to me (which it obviously wasn't in many ways). As far as I remember this only worked for about a year at best until Google penalised those sites more and more.
The US military has already been working on this for ~a decade.
There was a contract out of Redstone Arsenal where they were writing "story spinners" to scrape and re-word war-time propaganda.
>"It all underscores the fact that the ad tech space is so convoluted, it’s easy to make money from legitimate advertisers just by setting up a web page.
That means there’s significant incentive to create sites not just with low-quality clickbait or A.I.-generated nonsense, but sites filled with outright plagiarized content."
It's really easy to get any content on the internet but really hard to verify whether it is plagiarized. Basically anyone can place some ads on their website, but if the site posts nothing but copied content, I doubt it will last.
Often it's just affiliate fraud: you load affiliate links for casinos, AliExpress and whatnot, and hope for the payout. That is why they redirect like crazy to hide their tracks, since the sites offering affiliate programs don't want that kind of traffic.
Because it isn't full articles -- just updates as they happen coming straight from the newsroom, more like tweets. It's not to say that people can't plagiarize, but it wouldn't be as easy or make as much sense as just copying and pasting an article.
There's not much in the post so I'm gonna guess it's a form of content fingerprinting like we see with YouTube's Content ID, plus whatever is used in plagiarism-detection software used in schools and universities.
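To make that guess a bit more concrete, here is a toy version of text fingerprinting using word shingles and Jaccard similarity. This is a common textbook technique, not what the post describes and not how Content ID or any commercial plagiarism checker is known to work; the sample sentences and the shingle size of 3 are arbitrary choices.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word windows (shingles) in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up sample texts, for illustration only.
original = "the city council approved the new budget on tuesday night"
copied = original + " after a long debate"
unrelated = "how to renew your driver's license in three easy steps"

copy_score = jaccard(shingles(original), shingles(copied))
unrelated_score = jaccard(shingles(original), shingles(unrelated))
print(copy_score, unrelated_score)
```

A copy with a sentence tacked on still scores high, while unrelated text scores zero. Note that a score like this cannot tell licensed wire-service syndication from plagiarism, since both are near-identical text.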