Content Scraping and a new addition

Today, I got an e-mail from some legitimate-looking site asking if they could use a web crawler to archive my blog on their site, so that people 5, 10, 15 or 20 years from now can research gaming, criticism, etc.

I say “ask” but they actually only told me that they would do that… and that I could refuse if I wanted to. I responded that I don’t want them to do so.

Anyway, I’m making a post about it as I didn’t really have a post for today.

Research is great and I support it fully, but I don’t get why researchers wouldn’t be able to just check out my blog in the future as well. Sure, WordPress may not work in the future… Nah, just kidding. Something may happen to my blog or my site that stops me from ever posting on here again… But I’m sure that my posts will persist on the World Wide Web without any issues, even if I don’t want them to. Nothing gets lost on the internet after all, right?

But the way they did this was rather ugly. They worded everything in their e-mail in such an overly flowery way, hiding their intention, that I had to ask Frosti if he could translate it for me. At first, I was wondering if it was spam, but after checking site upon site and sources, as well as reverse-image-searching the woman who e-mailed me, I found out that it’s actually legitimate. Still, I found it weird that they didn’t use language that’s easier to understand.

On top of that, I don’t really see how this wouldn’t affect my blog’s performance, or why people wouldn’t just ask me their questions if they want to research my blog. If there was one researcher or scientist who asked for permission to use my site, I’d probably allow it (don’t take that as permission btw, e-mail me instead). It’s a different story to just scrape off content like that, effectively stealing it, and then upload it to another public site where it’s just going to get checked out by people who won’t have to visit my site.

My blog works in the same way that their archive works… with the simple difference that my blog and all content hosted on here is owned by me. I mean, the words I wrote and the thoughts I thought are my intellectual property, right?

So I declined the offer. But I’m sure there is some site somewhere that is doing that already, and I guess I don’t have the resources to check every single site on the internet.

So, I thought I’d introduce something to my blog that a lot of other bloggers also have on their sites… the following block:

This post originated on Indiecator and was first published on there by Dan Indiecator aka MagiWasTaken.

Is it gonna do a lot? Probably not. Will it protect me from worrying that my posts are getting used somewhere else to generate money for other people? A little bit.

The big idea here is that I’ll basically just put that in all of my 261 posts so far (or at least most of them) and the many more to come… That way, people who find one of my posts somewhere else will potentially get led to my site, where it actually originated. The catch is that I’ll have to add this reusable block to all 261 of those posts… and I’m kinda annoyed by that already… oof.

Maybe I’m being a bit sensitive about this or a bit paranoid… but I don’t want other people to earn money off of stuff that I created, especially when I don’t earn a cent in the first place and when I wouldn’t receive anything from them. I feel like that’s fair enough, right? 

What are you guys’ thoughts on this? Have you had to deal with people stealing your posts before? What did you do about it? Any other suggestions on how to make this place safer against that? Let me know!

Cheers!