Using Geolocation and GeoIP in web applications
08/18/2011
The Internet has evolved into a collection of resources. Some of these resources apply to a local audience and others to a much wider populace. Don’t get me wrong, the seemingly endless mass of information is a wonderful tool (truly a world changer), but it has also diluted the importance of localization and personalization. Some of our clients face the dilemma of providing local goods and services while also needing to serve a national audience. How do you serve both audiences from the same web site?
Take for example a medical practice wanting to serve local patients suffering from hair loss in metro code #636—Harlingen-Weslaco-Brownsville-McAllen Texas. The practice wants to appear to be local because this is to their advantage in the Harlingen, Texas area. In our example, the medical clinic earns 60% of their revenue from selling an organic topical cream to stimulate hair growth. E-commerce sales come from all across the United States.
The company spends about $48,000 per month on Paid Search Marketing (Google & Bing pay-per-click advertising). They pitch making an appointment at the office to local prospects and video/e-commerce to the national audience. Making matters worse, the medical practice needs to sell on thinner margins nationally and on higher margins in the local market to cover the cost associated with a higher level of service. Selling both local and national on the same website presents problems because the site is optimized for neither audience's acquisition and conversion. This usually manifests in poorly performing PPC campaigns or a high cost per acquisition metric. Using different URLs is usually not an option due to cost and duplicative content/SEO concerns.
As an internet consulting and full-service ad firm, we often deal with these sorts of issues. Companies are looking for clientele on a global level and still want to accommodate local clients. Our solution is GeoIP or geolocation by internet protocol address (IP address). Using GeoIP, clients can reach out to prospects on a global level and still tailor services to prospects, members, customers or patients who are within driving distance.
A handful of solution providers offer geolocation by IP services. Most offer a hosted web service on a subscription plan tied to a number of data queries (vendors like Quova, SiteBrand, Google, and ISAPILab’s Country Locator). These services tend to be very expensive for high-volume websites. With a hosted solution, an XMLHttpRequest object can be used to send IP address data to the vendor and get back location data. This all happens under the hood of a website, typically in the global.asax or tracking manager scripts (we're using ASP.NET here).
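To make the hosted round trip concrete, here is a minimal sketch in Python rather than ASP.NET, and as a server-side request rather than an XMLHttpRequest. The endpoint URL, the `ip` and `key` query parameters, and the JSON field names are all hypothetical placeholders; a real vendor will document its own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: replace with the URL your vendor documents.
GEO_ENDPOINT = "https://geo.example.com/lookup"


def build_lookup_url(ip, api_key):
    """Build the request URL; the 'ip' and 'key' parameter names are assumptions."""
    return GEO_ENDPOINT + "?" + urlencode({"ip": ip, "key": api_key})


def parse_location(body):
    """Pull the fields we care about out of an assumed JSON response body."""
    data = json.loads(body)
    return {
        "country": data.get("country"),
        "region": data.get("region"),
        "city": data.get("city"),
    }


def lookup(ip, api_key, timeout=2.0):
    """Send the visitor's IP to the hosted service and return its location."""
    with urlopen(build_lookup_url(ip, api_key), timeout=timeout) as resp:
        return parse_location(resp.read().decode("utf-8"))
```

Every page view that triggers `lookup` is a billable query and a network hop, which is a large part of why hosted services get expensive on busy sites.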
There are four credible sources to acquire raw data. MaxMind and IP2Location tend to be the largest players in the data market space. In our testing, HostIP (a free data provider) seems to be less reliable. Mining the likes of InterNic is very time intensive and is not practical. The commercial data providers have advanced techniques to scrub publicly available data and tend to be more accurate. We tend to deploy custom solutions using data provided by MaxMind. Data costs around $1k per year including updates.
How it works
When a person visits a website, code embedded in the website references a database and determines the user’s country, region, city, postal code, area code, company… Furthermore, GeoIP can provide information such as longitude/latitude, connection speed, ISP, company name, domain name, and whether the IP address is an anonymous proxy or satellite provider.
The notion behind GeoIP is simple. We deploy a data set consisting of:
- IP Numbers
- US/CA Regions
- US Zip Codes
- US Area Codes
- US Metro Codes
The dataset is imported into a series of SQL tables in a database. The website converts a user’s IP address (accessible from almost any website) to an integer using a formula described below. In our case we use a custom .NET class that we at D3 developed long ago. The integer is looked up in our database and the resulting recordset contains the country along with the additional information listed above. This .NET class is added to the site’s tracking manager so that each website visitor carries a session which contains the country code, state...
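The session step can be sketched in a few lines. This is Python, not our .NET class, and the names are made up for illustration: `attach_geo` is a hypothetical helper, `session` is any dict-like per-visitor store, and `lookup` is whatever function queries your GeoIP tables for an IP number.

```python
import ipaddress


def attach_geo(session, ip, lookup):
    """Look up the visitor's location once and cache it in the session.

    `lookup` is a hypothetical callable: it takes the integer form of the
    IP address and returns a dict of location fields, or None if unknown.
    """
    if "geo" not in session:
        # int() on an IPv4Address yields the same whole number the
        # database stores for range comparisons.
        ip_number = int(ipaddress.IPv4Address(ip))
        session["geo"] = lookup(ip_number) or {"country": None}
    return session["geo"]


if __name__ == "__main__":
    def fake_lookup(ip_number):
        # Stand-in for a real database query; the values are illustrative.
        return {"country": "US", "region": "TX", "metro_code": 636}

    visitor_session = {}
    print(attach_geo(visitor_session, "192.168.1.1", fake_lookup))
```

Caching in the session means the database is hit once per visit instead of once per page view.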
The website is augmented, most often using
In our example above, the home page typifies a holistic medical practice inside metro code 636. Outside metro code 636, the website appears to be a national leader in holistic hair loss. Graphics, navigation and text, calls to action… flip depending on the geography. The solution is elegant and very powerful. It is also fairly simple to implement. The end result is serving two audiences on a single website. Hands down, the technique provides greater conversion and lower acquisition costs. The only real issue is that the search engine typically indexes the site as a national player. Local paid marketing and attention to local SEO techniques mitigate any real problems.
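As a rough illustration of the flip, the sketch below picks a content variant from the location stored in the visitor's session. The metro code comes from the example above; the function name, the dictionary layout and the placeholder copy are assumptions, not what runs on the client's site.

```python
LOCAL_METRO_CODE = 636  # Harlingen-Weslaco-Brownsville-McAllen, from the example

# Placeholder copy for illustration only.
VARIANTS = {
    "local": {
        "headline": "Your local hair loss clinic",
        "call_to_action": "Make an appointment",
    },
    "national": {
        "headline": "Holistic hair loss treatment, shipped nationwide",
        "call_to_action": "Watch the video and order online",
    },
}


def pick_variant(geo):
    """Return 'local' inside the target metro code, otherwise 'national'.

    Visitors with no usable location fall back to the national variant,
    which is a design assumption in this sketch.
    """
    if geo and geo.get("metro_code") == LOCAL_METRO_CODE:
        return "local"
    return "national"


if __name__ == "__main__":
    content = VARIANTS[pick_variant({"country": "US", "metro_code": 636})]
    print(content["call_to_action"])
```

Defaulting unknown visitors to the national version also lines up with how search engines tend to see the site.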
A deeper look at MaxMind
Let’s drill a bit deeper for the technical folks. To determine a visitor’s location, the visitor’s IP address needs to be cross-referenced in our database. The IP numbers supplied by our vendor are stored as ranges, each with a beginning and an ending IP number. This is done to keep the database optimized to under 6 million rows. When a visitor’s IP is cross-referenced, the database checks whether it falls within one of these IP ranges. The beginning and ending IP numbers are stored as whole numbers rather than in the common dotted structure with octets (e.g. 192.168.1.1) for performance reasons (i.e. integers are easier to look up than nvarchars). We convert an IP into an integer using the following calculation:
IP address number = 16777216*w + 65536*x + 256*y + z (1)
where IP address = w.x.y.z
The reverse of this formula is:
w = int ( ipnum / 16777216 ) % 256;
x = int ( ipnum / 65536 ) % 256;
y = int ( ipnum / 256 ) % 256;
z = int ( ipnum ) % 256;
Where % is the modulus operator.
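For anyone who wants to try the arithmetic, here is a small Python sketch of formula (1) and its reverse. The function names are ours for illustration; this is not the .NET class mentioned earlier.

```python
def ip_to_number(ip):
    """Convert a dotted IPv4 address w.x.y.z to its IP number, per formula (1)."""
    parts = ip.split(".")
    if len(parts) != 4:
        raise ValueError("expected four octets: %r" % ip)
    w, x, y, z = (int(p) for p in parts)
    for octet in (w, x, y, z):
        if not 0 <= octet <= 255:
            raise ValueError("octet out of range in %r" % ip)
    return 16777216 * w + 65536 * x + 256 * y + z


def number_to_ip(ipnum):
    """Reverse the formula: recover w.x.y.z from the IP number."""
    w = (ipnum // 16777216) % 256
    x = (ipnum // 65536) % 256
    y = (ipnum // 256) % 256
    z = ipnum % 256
    return "%d.%d.%d.%d" % (w, x, y, z)


if __name__ == "__main__":
    number = ip_to_number("192.168.1.1")
    print(number)                # 3232235777
    print(number_to_ip(number))  # 192.168.1.1
```

The multipliers are simply 256 cubed, 256 squared and 256, so each octet lands in its own byte of a 32-bit number.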
Once the IP address has been converted to a whole number, it can easily be looked up in the database using a stored procedure.
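The range check itself is a simple query. The sketch below uses Python with an in-memory SQLite table and a parameterized query in place of a stored procedure, so treat it as an illustration of the idea and not our production code. The table name, column names and the two sample rows are made up (the ranges are private addresses, chosen only so the numbers are easy to follow).

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of IP ranges with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ip_ranges ("
        " start_ip INTEGER PRIMARY KEY,"  # indexed, so the search below is fast
        " end_ip INTEGER NOT NULL,"
        " country TEXT, region TEXT, metro_code INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ip_ranges VALUES (?, ?, ?, ?, ?)",
        [
            (3232235776, 3232236031, "US", "TX", 636),   # 192.168.1.0 - 192.168.1.255
            (3232236032, 3232236287, "US", "CA", None),  # 192.168.2.0 - 192.168.2.255
        ],
    )
    return conn


def find_range(conn, ip_number):
    """Return the location for the range containing ip_number, or None."""
    # Take the closest range starting at or below the number, then
    # confirm the number does not fall past that range's end.
    row = conn.execute(
        "SELECT end_ip, country, region, metro_code FROM ip_ranges"
        " WHERE start_ip <= ? ORDER BY start_ip DESC LIMIT 1",
        (ip_number,),
    ).fetchone()
    if row is None or ip_number > row[0]:
        return None
    return {"country": row[1], "region": row[2], "metro_code": row[3]}


if __name__ == "__main__":
    db = build_demo_db()
    print(find_range(db, 3232235777))  # 192.168.1.1 falls in the first range
```

Searching on the starting number alone and checking the ending number afterwards lets the database use a single index, which matters once the table holds millions of rows.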
If you’d like to know more about how we use GeoIP technology, give me a call or send me an e-mail. I’d be happy to provide even greater insight. There are really interesting areas that need consideration. Things like:
- How do different search engine spiders respond to GeoIP?
- Does Google/Yahoo/MSN see this as a ‘black-hat’ technique?
- What to do about Canadian AOL user identification problems.
- Accuracy - 99.8% accurate on a country level, 90% accurate on a state level, and 83% accurate for the US within a 25 mile radius.
- IP version 6 IP addresses.
- How to handle anonymous proxies.
- Satellite provider identification issues (mostly Africa and the Middle East).
- Circumstances where there are no clear indications as to where the end users are located. For example, we may mark a corporate proxy located in London, GB as "EU" if we can see that the ultimate end users are connecting from various parts of Europe.
- What to do about changes - 1-2% of the GeoIP database changes each month.
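On the IPv6 point, one concrete wrinkle is that the four-octet formula (1) no longer applies. A minimal Python sketch using the standard `ipaddress` module shows the difference; the helper name is hypothetical.

```python
import ipaddress


def to_ip_number(text):
    """Return (version, integer) for an IPv4 or IPv6 address string."""
    addr = ipaddress.ip_address(text)
    return addr.version, int(addr)


if __name__ == "__main__":
    print(to_ip_number("192.168.1.1"))  # IPv4: fits in 32 bits
    print(to_ip_number("2001:db8::1"))  # IPv6: a 128-bit number
```

An IPv6 number can be as large as 128 bits, so it will not fit in a 64-bit integer column and the range tables need a different storage strategy.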
We sometimes use GeoIP to adjust pricing by geography. Richer areas pay more. Spaces with more competition pay less. New markets get different messaging… With GeoIP the possibilities are almost endless.
We hope that this post has opened your eyes to what is possible with GeoIP.