Andrew McSherry's blog about tech-related stuff that really needs to be updated more often :)
Monday, November 25, 2013
NCAA Football Rankings Algorithm
Released an NCAA football rankings algorithm today. It's pretty simple, but remarkably effective and unbiased. You can read more about it on GitHub.
Thursday, May 16, 2013
New Website Launched
Threw up a new site today that posts information about developers. Right now, you can only find people by programming language, but I'll add location too. The design's really a disaster at the moment; maybe I'll fix that.
Find ya some devs
Sunday, March 31, 2013
Google Analytics April Fools
Seems the International Space Station has been surfing all my sites today :) The bubble moves around to wherever the station is orbiting at the moment. You can see it in the Real-Time overview (https://www.google.com/analytics/web/?hl=en&pli=1#realtime/rt-overview)
Saturday, March 23, 2013
A Response to a Complaint by Bitcoin Socially
The owner of Bitcoin Socially has complained about this page on one of my websites. Here's his email:
You did not ask to use our images nor to post our email address on badappreviews.com/apps/150044
You will be given 10 days to take down our content or our lawyers will attempt to have the entire site taken down via the Digital Millennium Copyright Act. We will also file a suit for the illegally used content.
Please take this matter seriously,
Bitcoin Socially
Since his email server appears not to be functioning and I can't reply to him, I've decided to post my response here.
To start, I find it highly unlikely that I've violated any of your copyrights. Email addresses are likely not copyrightable. According to the US Copyright Office, "Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed." An email address appears to me to be a fact, just simply an address at which someone can be contacted. Regardless, I've written to them regarding the matter and should hear back next week.
My use of your images seems to be protected based on the decision in Perfect 10 v. Amazon.com. According to this decision, "the owner of a computer that does not store and serve the electronic information to a user is not displaying that information, even if such owner in-line links to or frames the electronic information." This is exactly the case on my website. My servers neither store nor serve these images; I simply provide browser instructions to display images Bitcoin Socially has made publicly available.
Update:
The US Copyright Office got back to me. As I expected, you cannot copyright an email address.
Saturday, February 23, 2013
Bad App Reviews Now Has iOS Apps
Bad App Reviews now has iOS apps. We've got about 90k of them listed, but we're still filling in reviews. About 5k apps have reviews right now, and we're adding them at a rate of 3k apps/day. You can find them under the search or index, right alongside their Android counterparts.
Mashable's HTML Intro
Noticed this ASCII art at the top of Mashable's HTML today. Seems it gets sent for every page on their site.
<!--
o o o + o
+ + + o + +
+
o + + o + + +
__ __ _ _ _
~_,-| \/ | __ _ ___| |__ __ _| |__ | | ___
| |\/| |/ _` / __| '_ \ / _` | '_ \| |/ _ \,-~_,- - - ,
~_,-| | | | (_| \__ \ | | | (_| | |_) | | __/ | /\_/\
|_| |_|\__,_|___/_| |_|\__,_|_.__/|_|\___| ~=|__( ^ .^)
~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,-~_,"" ""
o o o + o
+ + + o + +
+
o + + o + + +
-->
Tuesday, February 19, 2013
Scraping the Web Without a Proxy on Heroku
403 Forbidden: one of the biggest issues when scraping websites. Eventually, after you bombard any reasonably intelligent site with hundreds of requests per minute, it's going to cut you off for a period of time, if not ban you outright. The common workaround has usually been to get a list of proxies and rotate your requests through them. Thus, your traffic appears to come from different places and is less noticeable. However, there are a few issues with this.
Proxies are slow
Routing through a proxy adds an extra hop, which can easily double your latency. Instead of going from A to B, you need to go from A to C to B. Furthermore, you're not likely the only one using it. Most public proxies get swarmed with requests, and this adds bandwidth issues into the mix.
Proxies only accept certain requests
Most public proxies only accept GET requests, and may limit the domains you can access for a variety of reasons. This isn't the case with all of them, but it could easily be an issue.
Proxies expire
When using proxy servers, you'll need to keep a constantly updated list of available servers. They go down without notice and new servers surface all the time.
A Better Solution
We can get around these issues by using Heroku Scheduler. The beauty of Heroku is that each dyno has a different IP address. They're distributed around Amazon Web Services, which contains hundreds of thousands, if not millions, of IP addresses. Every time you spin up a new dyno, you get a new IP address.
Another advantage is that Heroku prorates to the second. It doesn't matter how many dynos you spin up, just how long they stay alive. I've found it usually takes a Rails dyno about 10 seconds to start up, which is a pretty small penalty since you can usually run them for a few minutes before being blocked. You'll easily recoup that cost by not wasting time on proxies.
To take full advantage of this, write your scripts to fail fast. After a few unsuccessful requests, kill the dyno. Then set up your scheduling to run constantly. There's a minimum time interval of 10 minutes for the scheduler, but you can set up multiples of 10 minutes. This way, you'll actually be able to run through thousands of different IP addresses a day without fear of getting cut off.
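The fail-fast idea above can be sketched as a small counter that tells the script when to give up. This is only a minimal illustration in JavaScript (Node.js): the helper name createFailFastTracker, the threshold of three consecutive failures, and the simulated status codes are all assumptions made for the example, not part of the original setup.

```javascript
// Hypothetical helper: tracks consecutive failed responses and reports
// when the script should stop so the dyno can be shut down.
function createFailFastTracker(maxConsecutiveFailures) {
  let failures = 0;
  return {
    // Record one HTTP status code; returns true once the limit is reached.
    record(status) {
      const ok = status >= 200 && status < 300;
      failures = ok ? 0 : failures + 1; // any success resets the streak
      return failures >= maxConsecutiveFailures;
    },
  };
}

// Simulated run: the site starts answering 403 after a few requests.
const tracker = createFailFastTracker(3);
const simulatedStatuses = [200, 403, 200, 403, 403, 403, 200];
let processed = 0;
for (const status of simulatedStatuses) {
  processed += 1;
  if (tracker.record(status)) {
    break; // a real script would exit here, e.g. with process.exit(1)
  }
}
console.log(`Stopped after ${processed} requests`);
```

In a real scraper the status codes would come from your HTTP client, and exiting the process is what lets the dyno die so the next scheduled run can start from a fresh IP address.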
Sunday, February 17, 2013
How to Track Pinterest's Pinmarklets with Google Analytics
Pinterest is a great platform for your users to spread the word about your website. In a short period of time, they've managed to become one of the top 50 most trafficked websites. Just throw their Pinmarklet up, build a URL with an image and a link, and you're good to go. However, it's also a black hole when it comes to tracking through any analytics platform. You don't know how many times your users have shared to Pinterest, and you don't know how much traffic is coming back from these shares. For this article, we're going to focus on Google Analytics, but the same strategy could very well be used for any analytics platform around.
Tracking outgoing Pins
The first step is to be able to track outgoing pins. With the current Pin It button, an onclick handler attached to it won't get called. However, we can build an identical-looking link that even has the count bubble next to it. All you need to do with this script is replace your_url with the page's current URL and your_bookmarklet_url with the href on your current button.
<a class="pinterest-button" target="_blank" onclick="_gaq.push(['_trackEvent', 'Pinmarklet', 'Pinned']);window.open(this.href,'_blank','status=no,resizable=yes,scrollbars=yes,personalbar=no,directories=no,location=no,toolbar=no,menubar=no,width=632,height=270,left=0,top=0');return false;" href="your_bookmarklet_url">Pin It<span class="pinterest-count"><i></i><span></span></span></a>
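<script>
// JSONP callback for Pinterest's count endpoint; assumes jQuery ($) is loaded on the page.
function setPinterestCount(response) {
  // Only show the count bubble when the page has been pinned at least once.
  if (response["count"] != '0') {
    $('.pinterest-count').show();
    $('.pinterest-count > span').text(response["count"]);
  }
}
</script>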
<script type="text/javascript" src="//partners-api.pinterest.com/v1/urls/count.json?url=your_url&ref=your_url&callback=setPinterestCount"></script>
<style type="text/css">
.pinterest-button {
position: absolute;
background: url('http://assets.pinterest.com/images/pinit6.png');
color: #CD1F1F;
top:-11px;
height: 20px;
width: 43px;
background-position: 0 -7px;
}
.pinterest-count {
display:none;
padding: 0 3px 0 10px;
background-size: 45px 20px;
background-position: 2px 0;
position: absolute;
top: 0;
left: 41px;
height: 20px;
font: 10px Arial, Helvetica, sans-serif;
line-height: 20px;
background-color: transparent;
background-repeat: no-repeat;
background-image: url(http://passets.pinterest.com/images/pidgets/fpb1.png);
color: #777;
text-align: center;
}
.pinterest-count i {
background-color: transparent;
background-repeat: no-repeat;
background-image: url(http://passets.pinterest.com/images/pidgets/fpb1.png);
background-position: 100% 0;
position: absolute;
top: 0;
right: -2px;
height: 20px;
width: 2px;
}
</style>
Tracking Incoming Traffic from Your Pins
Unfortunately, Pinterest has decided to strip campaign parameters from all posts. So if you post the link http://blog.andymcsherry.com/page_path?utm_campaign=pin_it_button&utm_source=me&utm_medium=pinterest, the link on Pinterest will be http://blog.andymcsherry.com/page_path. We can get around this by reserving a path prefix for Pinterest. Simply share http://blog.andymcsherry.com/pinterest/page_path and set up your server to redirect all requests from /pinterest/page_path to /page_path?your_campaign_parameters. Then you'll be able to track this incoming traffic. While it'd generally be a good idea to do a 301 redirect in this circumstance, Pinterest uses rel=nofollow, so you don't need to worry about losing PageRank from these links. This method can also be used for referral links heading to Pinterest. Pinterest strips out known referral tags when users post to Pinterest (they actually used to append their own). If you redirect through your site in some way, you can ensure that these tags get added.
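The redirect rule described above might look something like the following in JavaScript. Treat it as a rough sketch: the function name pinterestRedirectTarget and the constants are made up for illustration, and the campaign values are just the ones from the example link. It only computes where to redirect; wiring it into your server is left to whatever framework you use.

```javascript
// Hypothetical path prefix reserved for links shared on Pinterest.
const PINTEREST_PREFIX = "/pinterest";

// Campaign parameters to re-attach (values taken from the example link above).
const CAMPAIGN_PARAMS = {
  utm_campaign: "pin_it_button",
  utm_source: "me",
  utm_medium: "pinterest",
};

// Returns the redirect target for a request path, or null when the path
// is not under the reserved Pinterest prefix.
function pinterestRedirectTarget(path) {
  const underPrefix =
    path === PINTEREST_PREFIX || path.startsWith(PINTEREST_PREFIX + "/");
  if (!underPrefix) {
    return null;
  }
  // Strip the prefix; a bare "/pinterest" maps to the site root.
  const realPath = path.slice(PINTEREST_PREFIX.length) || "/";
  return realPath + "?" + new URLSearchParams(CAMPAIGN_PARAMS).toString();
}

console.log(pinterestRedirectTarget("/pinterest/page_path"));
// /page_path?utm_campaign=pin_it_button&utm_source=me&utm_medium=pinterest
```

When the function returns a non-null value, the server would answer with a redirect to that location; any other request is handled normally.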
Labels:
analytics,
campaign tracking,
google analytics,
pinterest,
web
Thursday, February 7, 2013
Not Selected Index Stats Have Disappeared from Webmaster Tools
It appears that Google has removed the "Not Selected" stats from the Google Webmaster Tools index stats today. This option was under the Advanced tab of Index Status. It was an extremely useful metric for determining how much of your content Google considered valuable. I was always hoping that they'd list the pages that weren't selected so webmasters could better gauge how to improve the content. You'll still be able to retrieve a count of the removed URLs, which can be one warning sign, but it'd be useful to also see pages that never made the cut in the first place. Here's how it appeared before and after the change (these are not for my sites):
I haven't been able to find an official word from Google on the matter, but the consensus in the Webmaster Tools product forums is that it was removed because it was confusing to users. However, this rationale doesn't seem valid, especially since the stat was under the Advanced tab. I believe the real reason is to prevent users from testing what can be indexed and what can't. Google doesn't want you to be able to test which SEO tricks you can pull to get indexed. With this data published, it could be too easy to test whether content was being flagged and to expose flaws in their algorithms.
Why Your Android App Won't Port to BlackBerry 10
I've read quite a few articles recently about how simple it will be to port Android applications to BlackBerry 10. It's been hailed as the cure for the meager app offering BlackBerry will have on its new platform. Naturally, I investigated this, as such a simple port could open up a new platform for many of the apps I've already written. However, I soon discovered the list of unsupported APIs, and my whole plan was crushed. Even if your app can still function, chances are its feature set will be severely limited by these restrictions if it does anything interesting. Here are some of the biggest sticking points:
Potential Deal Breakers
These are some of the limitations that would likely make your app fundamentally useless.
- Services cannot run in the background. Once your app leaves the foreground, all background services are killed. Your app cannot play music, download data, schedule alarms, monitor location, or any of the multitude of reasons you might require performing work in the background.
- There is NO support for Bluetooth. None. If your app requires Bluetooth, give up now. If you have a feature that requires Bluetooth, you need to find a way to disable it. Same goes for NFC, but that's likely less of an issue for most developers.
- Intent filters for ACTION_SEND and ACTION_VIEW from outside your app are disabled. If your application allows users to view or share images, text, files, URLs or any other data from other applications, your users will have to open your app first, and you'll have to provide a mechanism for them to import it from inside your application.
- The NDK is not supported. Game-over for many OpenGL apps that used C++. However, there is a native SDK for BlackBerry 10 so porting an Android application may not have been the best option anyway. There are however many other applications of the NDK that go beyond games that will not be able to port.
- No live wallpapers, widgets, home screens, lock screens.
Alternative Implementations
If you made it through the first list, congrats! Chances are you'll be capable of porting your application. However, there's a significant chance you'll have to rewrite substantial parts of your applications. Most of these are related to the absence of Google Play Services.
- Notifications will be limited to one line of text. If you've built fancy, interactive notifications, they'll need to be reduced down to the bare minimum. This will likely eliminate some functionality and cause maintenance headaches.
- BlackBerry will use a different push notification service than with Google. Not only will this require a different implementation in your app, but it'll require server-side support as well.
- In-app purchases will go through BlackBerry App World, so you'll have to create an alternative implementation for interacting with it.
- If your application uses Maps, you'll have to use an alternative web-based Google Maps API for displaying them. In addition to having to redo your work, it appears to be a more limited API with a poorer user experience.
- Google account authentication through Google Play Services will not be available, so you'll have to create an alternative route to obtain OAuth tokens.
Other Sources of Frustration
- Since Google Play services will not be available on the device, you won't be able to use the Android Backup services. If you need to remotely persist user preferences, there's currently no substitute so you'll have to create your own service.
- In app +1 buttons will not be available.
- Support for all the accessibility APIs is missing so any improvements you've made for the deaf or blind will be unavailable.
- Your application cannot add or modify the user's contacts so any improvements will be limited to use inside your app.
- You cannot set Thread priority. Your background task that sends Analytics is going to have the same priority as the UI thread.
There are dozens of other unsupported APIs and features, but these are likely the most difficult hurdles to get across. Furthermore, I noticed that the documentation sometimes leaves a gap between what's stated as supported and what's stated as unsupported. The permissions list stood out the most here. There may well be more undocumented cases you'll come across, so I'd be glad to hear feedback from any developers who discover more.
Friday, January 25, 2013
New Apps
Launched some new apps lately:
Wolfram Alpha RSS
Mom Finds RSS
Hello Giggles RSS
Brain Pickings RSS
Ceiling Fan Calculator
Space Heater Calculator
Monday, January 21, 2013
Tuesday, January 15, 2013
New Google Analytics Console
Huge fan of the new Google Analytics redesign. It's nice to not have to switch over to tab sections to see real-time stats, shortcuts, dashboards and intelligence events. Here's a snapshot of the new left navigation panel.
Sunday, January 13, 2013
Using Proguard with Android
Some reasons you might want to use it
Security
Proguard makes it easier to hide the implementation details and keys of the services you use.
Protect IP
The work involved in creating your project is often the primary reason you don't have competitors. Proguard obfuscates your code so your competitors can't simply decompile and copy/paste your work.
Reduce APK Size
Proguard removes unused code, reduces the length of variable names, and inlines code. All this adds up to a smaller app size and less for your users to download.
Optimization
Proguard optimizes variable allocation, arithmetic, unnecessary code, etc.
Logging
Proguard can remove all your logging by removing the actual code that calls it. This is much simpler and more efficient than any other method.
Syntax
- http://proguard.sourceforge.net/#manual/usage.html
- # is used for comments
- Always use fully-qualified class names
Reflection
Reflection is easily the biggest gotcha when it comes to Proguard. Proguard works by analyzing which methods and classes actually get used. Despite some safeguards, it's generally best to assume that code reached only through reflection will get removed. You need to specify which methods and classes to keep. Generally, you should be as specific as possible because telling Proguard to keep large chunks of code drastically reduces its efficacy.
Setting up your proguard.cfg
Initial Template
#Remove all the injar/outjar/libraryjar junk, the android ant script takes care of this
-dontpreverify
-repackageclasses ''
-allowaccessmodification
-optimizations !code/simplification/arithmetic
-keepattributes *Annotation*
-keep public class * extends android.app.Activity
-keep public class * extends android.app.Application
-keep public class * extends android.app.Service
-keep public class * extends android.content.BroadcastReceiver
-keep public class * extends android.content.ContentProvider
-keep public class * extends android.view.View {
public <init>(android.content.Context);
public <init>(android.content.Context, android.util.AttributeSet);
public <init>(android.content.Context, android.util.AttributeSet, int);
public void set*(...);
}
-keepclasseswithmembers class * {
public <init>(android.content.Context, android.util.AttributeSet);
}
-keepclasseswithmembers class * {
public <init>(android.content.Context, android.util.AttributeSet, int);
}
-keepclassmembers class * implements android.os.Parcelable {
static android.os.Parcelable$Creator CREATOR;
}
-keepclassmembers class **.R$* {
public static <fields>;
}
The first thing you should notice is that we tell Proguard to keep all the classes (Activities, Services, etc.) we declare in AndroidManifest.xml. These are known as the entry points to the application. These classes should not have their names changed because Android won't be able to find them when needed.
Fragments
-keep public class * extends android.support.v4.app.Fragment
-keep public class * extends android.app.Fragment
Fragments are generally created through reflection. It's a good idea to keep them all.
3rd Party Libraries
-keep class android.** {*;}
-keep class com.millennialmedia.android.** {*;}
-keep class com.google.ads.** {*;}
Removing Logging
-assumenosideeffects class android.util.Log {
public static *** e(...);
public static *** w(...);
public static *** wtf(...);
public static *** d(...);
public static *** v(...);
}
-assumenosideeffects tells Proguard that calls to these methods don't actually do anything and can be removed if the return value is not used. The android.util.Log methods do return a value, though it's not particularly useful, so keep in mind that a call whose result you actually use won't be removed.
Serializables
-keepnames class * implements java.io.Serializable
-keepclassmembers class * implements java.io.Serializable {
static final long serialVersionUID;
private static final java.io.ObjectStreamField[] serialPersistentFields;
!static !transient <fields>;
!private <fields>;
!private <methods>;
private void writeObject(java.io.ObjectOutputStream);
private void readObject(java.io.ObjectInputStream);
java.lang.Object writeReplace();
java.lang.Object readResolve();
}
I'm not entirely sure why this isn't just built into the platform. These methods and fields from Serializable are discovered at runtime and won't be referenced directly in your code.
Click Methods
-keepclassmembers class * {
public void *ButtonClicked(android.view.View);
}
Methods that are called by android:onClick in your XML layouts are referenced by reflection at runtime. My strategy is to suffix them all with ButtonClicked and keep any method that ends with that.
Native Methods
-keepclasseswithmembernames class * {
native <methods>;
}
Native methods should not be obfuscated because they reference JNI methods relating to their name.
Obfuscating debug builds
It may be useful to obfuscate debug builds in a continuous integration environment. If so, you can modify your Android build.xml this way to accomplish it.
<target name="-debug-obfuscation-check">
<!-- yes, we want to obfuscate in debug too!!!! -->
<condition property="proguard.enabled" value="true" else="false">
<and>
<isset property="proguard.config" />
</and>
</condition>
<if condition="${proguard.enabled}">
<then>
<!-- Secondary dx input (jar files) is empty since all the
jar files will be in the obfuscated jar -->
<path id="out.dex.jar.input.ref" />
</then>
</if>
</target>
Air Conditioning App
Published a new Android app today to help users find the perfect-sized window air conditioner for their home. Branded for my friends' general contracting and remodeling company, The Chateau Group.
Air Conditioner Calculator on Google Play
Friday, January 11, 2013
Monday, January 7, 2013
Friday, January 4, 2013
O-H!
Lawrence, I love that you follow my blog. However, it really bothers me that there's a Michigan logo in my sidebar.
Thursday, January 3, 2013
Using Campaign Tracking to Send a Message
I put campaign tracking on nearly every link I send to the web. It's not just about finding out where your traffic is coming from, but also about leaving an impression on the other developers you're sending traffic to.
There's another useful application of this: sending messages to a site that have the potential to be very visible to someone who can make decisions. When you send a message to customer support, it gets mixed in with the general opinion of the user base and then passed on to product and marketing. Chances are, your opinion doesn't make it past the support team. However, the people at a company who look at analytics are often the ones trying to make practical product decisions based on this data. You reach them directly this way.
Suppose you visit my website, and decide you'd love to tell me how wonderful it is. You could send me a message with:
http://www.andymcsherry.com?utm_source=andy%20admirer&utm_medium=address%20bar&utm_campaign=letting%20people%20know%20they%20rock
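If you'd rather not hand-encode the spaces, a link like the one above can be assembled with a few lines of JavaScript. This is only a sketch; buildCampaignUrl is a hypothetical helper name, and it assumes the base URL has no query string of its own.

```javascript
// Hypothetical helper: appends utm_* campaign parameters to a base URL,
// percent-encoding each value (spaces become %20).
function buildCampaignUrl(baseUrl, source, medium, campaign) {
  const params = {
    utm_source: source,
    utm_medium: medium,
    utm_campaign: campaign,
  };
  const query = Object.entries(params)
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${baseUrl}?${query}`;
}

const messageUrl = buildCampaignUrl(
  "http://www.andymcsherry.com",
  "andy admirer",
  "address bar",
  "letting people know they rock"
);
console.log(messageUrl);
```

Run as-is, this prints the same URL shown above.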