This came up at WordCamp Chicago. We were in the unConference, talking about BuddyPress, when John James Jacoby asked how many of us use Blackbird Pie, a WordPress plugin that lets you embed tweets in the content of your post, bringing the style (avatar, CSS, etc.) over from Twitter. Then he asked whether it would be cool to see the same thing in WordPress, where we could bring any other WP content into our sites.
A lot of people liked that idea. I visibly cringed. Samuel “Otto” Wood made a joke about how that would bring splogs to a new level. But I was stuck on thinking about copyright and bandwidth.
The basic idea is sound and, I agree, really cool. If I can link to my site on Google+ or Facebook, and it pulls in a picture, couldn’t it pull in my formatting too? That would mean when you quote from my site, it looks like my site! The problem I see is that Facebook and Google can get away with embedding the picture only because they copy your picture to their servers and display it from their own sites.
How do I know that? Because I have hotlink protection and I know it works. So the only way for these sites to cache my images is to come and scrape what they see IN the post. (I’m pretty sure this is why the ‘featured image’ in WordPress doesn’t always show up on a Facebook link. If you have the image inside the post, it always shows up. If you don’t, it doesn’t. Simplest answer: they’re content scraping.) It’s nice to see that technology being used for good. Of course, if you extend the thought, you’ll realize how many servers these sites must dedicate just to storing snippets of other people’s data.
If your server cannot do that, then you should not be trying to emulate them.
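To make that scraping guess concrete, here is a minimal Python sketch of how a link preview might pick its picture from a page. I don’t know how Facebook or Google actually do it, so treat the whole thing as an assumption: the names `PreviewImageFinder` and `pick_preview_image` are made up for illustration, and the rule (use an `og:image` meta tag if the page declares one, otherwise take the first image found in the HTML) is just the simplest behavior that would match what I see.

```python
from html.parser import HTMLParser


class PreviewImageFinder(HTMLParser):
    """Hypothetical helper: collects candidate preview images from HTML."""

    def __init__(self):
        super().__init__()
        self.og_image = None   # content of <meta property="og:image">, if any
        self.first_img = None  # src of the first <img> tag, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:image":
            # Keep only the first declared Open Graph image.
            if self.og_image is None:
                self.og_image = attrs.get("content")
        elif tag == "img" and self.first_img is None:
            # Remember the first inline image as a fallback.
            self.first_img = attrs.get("src")


def pick_preview_image(html):
    """Return the URL a scraper might use for a preview, or None.

    Assumed rule (not any site's documented behavior): prefer the
    og:image meta tag, then fall back to the first <img> on the page.
    """
    finder = PreviewImageFinder()
    finder.feed(html)
    return finder.og_image or finder.first_img
```

Under those assumptions, a post with no inline image and no image metadata gives the scraper nothing to grab, which would line up with a featured image that lives outside the post content not showing up. And whatever URL does come back still has to be downloaded and stored on the scraper’s own servers, which is exactly the cost I’m talking about.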
People forget just how much work went into making Google and Facebook able to do that! They aren’t aware of how many servers, and how many people maintaining the servers, it takes to support that level of infrastructure. (Mind you, WordPress has about 4 or 5 people to Facebook’s couple hundred, so it’s not about the amount, but the people.) This makes the problem two-fold. Either you must have your server set up to handle the caching, or you steal the CSS (and thus bandwidth) from someone else.
Okay, okay, so CSS bandwidth is a drop in the bucket compared to images, I know. And maybe I’m making a mountain out of a molehill, but we already know exactly how dangerous it is to have your site heavily linked. I’ve suffered the Digg/Slashdot/ma.tt effect before, and been nailed with 300% traffic. Thankfully I built this server with that end-goal in mind, and the last time it happened, no one noticed. Which is as it should be. But if I were still on my old shared host, I’d probably have died.
This cuts back to why certain things are made plugins/add-ons and others are part of a product by default. When you support ‘all’ things, you have to limit your product to what actually is supportable. Microsoft Office works on most systems, but it doesn’t work on all, and it has known conflicts. Because of those conflicts, there are features Microsoft knowingly left out! They would rather support as much as they can for as many people as they can. If your product is a niche product, you can get away with only supporting certain things, but a web app (Drupal, WordPress, etc.) cannot. (And this is why you won’t see caching built into WordPress. Too many different server setups!)
In the end, I think that embedding contextual content in a site is a nice idea, but unfeasible. You’ll never be able to support all possibilities, and you’ll never be able to do it in a way that ensures you stay on the right side of the law. If you want to link to someone, make a quote, link back, and use it as a part of your site, branded your way. If the look and feel of the post is important (like Twitter or YouTube), then hope they’ve come up with a way where they want you to be able to embed the content in your site.
Until then, share my content, but leave my style alone on my site, where it belongs.