Andy McSherry
Andrew McSherry's blog about tech-related stuff that really needs to be updated more often :)
Thursday, May 16, 2013
Find ya some devs
Threw up a new site today that posts information about developers. Right now, you can only find people by programming language, but I'll add location too. The design's really a disaster at the moment; maybe I'll fix that.
Sunday, March 31, 2013
Google Analytics April Fools
Seems the International Space Station has been surfing all my sites today :) The bubble moves around to wherever the station is orbiting at the moment. You can see it in the Real-Time overview (https://www.google.com/analytics/web/?hl=en&pli=1#realtime/rt-overview)
Saturday, March 23, 2013
A Response to a Complaint by Bitcoin Socially
The owner of Bitcoin Socially has complained about this page on one of my websites. Here's his email:
You did not ask to use our images nor to post our email address on badappreviews.com/apps/150044
You will be given 10 days to take down our content or our lawyers will attempt to have the entire site taken down via the Digital Millennium Copyright Act. We will also file a suit for the illegally used content.
Please take this matter seriously,
Bitcoin Socially
Since his email server appears not to be functioning and I can't reply to him, I've decided to post my response here.
To start, I find it highly unlikely that I've violated any of your copyrights. Email addresses are likely not copyrightable. According to the US Copyright Office, "Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed." An email address appears to me to be a fact, simply an address at which someone can be contacted. Regardless, I've written to the Copyright Office regarding the matter and should hear back next week.
My use of your images seems to be protected based on the decision in Perfect 10 v. Amazon.com. According to this decision, "the owner of a computer that does not store and serve the electronic information to a user is not displaying that information, even if such owner in-line links to or frames the electronic information." This is exactly the case on my website. My servers neither store nor serve these images; I simply provide browser instructions to display images Bitcoin Socially has made publicly available.
Update:
The US Copyright Office got back to me. As I expected, you cannot copyright an email address.
Saturday, February 23, 2013
Bad App Reviews Now Has iOS Apps
Bad App Reviews now has iOS apps. We've got about 90k of them listed, but we're still filling in reviews. About 5k apps have reviews right now, and we're adding more at a rate of 3k apps/day. You can see them under the search or index, right alongside their Android counterparts.
Mashable's HTML Intro
Noticed this ASCII art at the top of Mashable's HTML today. Seems it gets sent for every page on their site.
<!--
o o o + o
+ + + o + +
+
o + + o + + +
__ __ _ _ _
~_,-| \/ | __ _ ___| |__ __ _| |__ | | ___
| |\/| |/ _` / __| '_ \ / _` | '_ \| |/ _ \,-~_,- - - ,
~_,-| | | | (_| \__ \ | | | (_| | |_) | | __/ | /\_/\
|_| |_|\__,_|___/_| |_|\__,_|_.__/|_|\___| ~=|__( ^ .^)
~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,"" ""
o o o + o
+ + + o + +
+
o + + o + + +
-->
Tuesday, February 19, 2013
Scraping the Web Without a Proxy on Heroku
403 Forbidden: one of the biggest issues when scraping websites. Eventually, after you bombard any reasonably intelligent site with hundreds of requests per minute, it's going to cut you off for a period of time, if not ban you outright. The common workaround has been to get a list of proxies and rotate your requests through them. Thus, your traffic appears to come from different places and is less noticeable. However, there are a few issues with this.
Proxies are slow
The nature of using a proxy should at least double your latency. Instead of going from A to B, you need to go from A to C to B. Furthermore, you're not likely the only one using it. Most public proxies get swarmed with requests, which adds bandwidth issues into the mix.
Proxies only accept certain requests
Most public proxies only accept GET requests, and may limit the domains you can access for a variety of reasons. This isn't the case with all of them, but it could easily be an issue.
Proxies expire
When using proxy servers, you'll need to keep a constantly updated list of available servers. They go down without notice and new servers surface all the time.
A Better Solution
We can get around these issues by using Heroku Scheduler. The beauty of Heroku is that each dyno has a different IP address. They're distributed around Amazon Web Services, which contains hundreds of thousands, if not millions, of IP addresses. Every time you spin up a new dyno, you get a new IP address.
Another advantage is that Heroku prorates to the second. It doesn't matter how many dynos you spin up, just how long they stay alive. I've found it usually takes a Rails dyno about 10 seconds to start up, which is a pretty small penalty since you can usually run them for a few minutes before being blocked. You'll easily save on costs by not killing time in proxies.
To take full advantage of this, write your scripts to fail fast. After a few unsuccessful requests, kill the dyno. Then set up your scheduling to run constantly. There's a minimum time interval of 10 minutes for the scheduler, but you can set up multiples of 10 minutes. This way, you'll actually be able to run through thousands of different IP addresses a day without fear of getting cut off.
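The fail-fast idea above could look something like the sketch below. It's written in JavaScript to match the other snippets on this blog rather than Rails, and everything named here (createFailureTracker, scrape, fetchStatus, the failure threshold) is a made-up illustration, not part of Heroku or any library.

```javascript
// Counts consecutive failed responses so a scraper knows when to give up.
// maxConsecutiveFailures is an assumed tuning knob, not a Heroku setting.
function createFailureTracker(maxConsecutiveFailures) {
  let consecutiveFailures = 0;
  return {
    // Record one HTTP status code; returns true while it's worth continuing.
    record(statusCode) {
      const ok = statusCode >= 200 && statusCode < 300;
      consecutiveFailures = ok ? 0 : consecutiveFailures + 1;
      return consecutiveFailures < maxConsecutiveFailures;
    },
  };
}

// Walks the URL list and stops as soon as the tracker gives up.
// fetchStatus is a hypothetical function you supply: (url) => Promise<status code>.
// Resolves to the list of URLs that came back with a 2xx status.
async function scrape(urls, fetchStatus, maxConsecutiveFailures) {
  const tracker = createFailureTracker(maxConsecutiveFailures);
  const fetched = [];
  for (const url of urls) {
    const status = await fetchStatus(url);
    if (status >= 200 && status < 300) fetched.push(url);
    if (!tracker.record(status)) break; // probably blocked: stop and let the dyno exit
  }
  return fetched;
}
```

Once scrape returns early, the script can simply end (for example with process.exit()) so the dyno stops running, and the next scheduled run picks up where this one left off.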
Sunday, February 17, 2013
How to Track Pinterest's Pinmarklets with Google Analytics
Pinterest is a great platform for your users to spread the word about your website. In a short period of time, they've managed to become one of the top 50 most trafficked websites. Just throw their Pinmarklet up, build a URL with an image and a link, and you're good to go. However, it's also a black hole when it comes to tracking through any analytics platform. You don't know how many times your users have shared to Pinterest, and you don't know how much traffic is coming back from these shares. For this article, we're going to focus on Google Analytics, but the same strategy could very well be used for any analytics platform around.
Tracking outgoing Pins
The first step is to be able to track outgoing pins. With the current Pin It button, an onclick handler slapped on it won't get called. However, we can build a link that looks the same and even has the count bubble next to it. All you need to do for this script is replace your_url with the page's current URL and your_bookmarklet_url with the href on your current button.
<a class="pinterest-button" target="_blank" onclick="_gaq.push(['_trackEvent', 'Pinmarklet', 'Pinned']);window.open(this.href,'_blank','status=no,resizable=yes,scrollbars=yes,personalbar=no,directories=no,location=no,toolbar=no,menubar=no,width=632,height=270,left=0,top=0');return false;" href="your_bookmarklet_url">Pin It<span class="pinterest-count"><i></i><span></span></span></a>
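<script>
// JSONP callback invoked by the Pinterest count script below.
function setPinterestCount(response) {
  // Only show the count bubble once the page has at least one pin.
  if (response["count"] != '0') {
    $('.pinterest-count').show();
    $('.pinterest-count > span').text(response["count"]);
  }
}
</script>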
<script type="text/javascript" src="//partners-api.pinterest.com/v1/urls/count.json?url=your_url&ref=your_url&callback=setPinterestCount"></script>
<style type="text/css">
.pinterest-button {
position: absolute;
background: url('http://assets.pinterest.com/images/pinit6.png');
color: #CD1F1F;
top:-11px;
height: 20px;
width: 43px;
background-position: 0 -7px;
}
.pinterest-count {
display:none;
padding: 0 3px 0 10px;
background-size: 45px 20px;
background-position: 2px 0;
position: absolute;
top: 0;
left: 41px;
height: 20px;
font: 10px Arial, Helvetica, sans-serif;
line-height: 20px;
background-color: transparent;
background-repeat: no-repeat;
background-image: url(http://passets.pinterest.com/images/pidgets/fpb1.png);
color: #777;
text-align: center;
}
.pinterest-count i {
background-color: transparent;
background-repeat: no-repeat;
background-image: url(http://passets.pinterest.com/images/pidgets/fpb1.png);
background-position: 100% 0;
position: absolute;
top: 0;
right: -2px;
height: 20px;
width: 2px;
}
</style>
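If you'd rather generate the count script's src than hand-edit your_url, a small helper like the one below could do it. This is only a sketch: buildPinCountUrl is a made-up name, and URL-encoding the page address is my assumption based on normal query-string practice, not something Pinterest documents here.

```javascript
// Builds the src for the JSONP count script used above.
// pageUrl is the page being pinned; callbackName is the global JSONP callback.
function buildPinCountUrl(pageUrl, callbackName) {
  // Assumption: the endpoint wants the page URL encoded like any query value.
  const encoded = encodeURIComponent(pageUrl);
  return '//partners-api.pinterest.com/v1/urls/count.json' +
    '?url=' + encoded +
    '&ref=' + encoded +
    '&callback=' + callbackName;
}
```

On a live page you would call it with window.location.href and 'setPinterestCount', then use the result as the src of the script tag.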
Tracking Incoming Traffic from Your Pins
Unfortunately, Pinterest has decided to strip campaign parameters from all posts. So if you post the link http://blog.andymcsherry.com/page_path?utm_campaign=pin_it_button&utm_source=me&utm_medium=pinterest, the link on Pinterest will be http://blog.andymcsherry.com/page_path. We can get around this by specifying a page path reserved for Pinterest. Simply share http://blog.andymcsherry.com/pinterest/page_path and set up your server to redirect all requests from /pinterest/page_path to /page_path?your_campaign_parameters. Then you'll be able to track this incoming traffic. While it'd generally be a good idea to do a 301 redirect in this circumstance, Pinterest uses rel=nofollow, so you don't need to worry about losing page rank from these links. This method can also be used for referral links heading to Pinterest. Pinterest strips out known referral tags when users post to Pinterest (they actually used to append their own). If you redirect through your site in some way, you can ensure that these tags get added.
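The redirect rule described above boils down to a small path rewrite. Here's a rough sketch of that piece on its own, assuming a JavaScript server; pinterestRedirectTarget and PINTEREST_CAMPAIGN are made-up names, and the campaign values are just the ones from the example link.

```javascript
// Campaign parameters to attach; the values mirror the example link above.
const PINTEREST_CAMPAIGN = {
  utm_campaign: 'pin_it_button',
  utm_source: 'me',
  utm_medium: 'pinterest',
};

// Maps "/pinterest/page_path" to "/page_path?utm_campaign=...".
// Returns null for any path outside the reserved /pinterest/ prefix.
function pinterestRedirectTarget(requestPath) {
  const prefix = '/pinterest/';
  if (!requestPath.startsWith(prefix)) return null;
  const realPath = '/' + requestPath.slice(prefix.length);
  const query = new URLSearchParams(PINTEREST_CAMPAIGN).toString();
  return realPath + '?' + query;
}
```

Your request handler would call this for each incoming path and, whenever it returns a value, answer with a redirect whose Location header is that value.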
Labels: analytics, campaign tracking, google analytics, pinterest, web
