I've been hosting sites for myself and for clients since 1995, and over those years have used many different web hosts.  I've had good experiences and bad, and am always surprised when clients come to me having already paid for their hosting without having shopped around or considered their needs.  Hopefully, this page will serve as a guide for people who are looking for an appropriate host for their sites at any stage of their site's lifetime.


Assess Your Needs

Before you can shop for a host, you need to consider what the hosting needs of your site might be.  Depending on the size, expected traffic, and technology needs of your site, you will need different hosting.

Estimating Size and Traffic

It may be difficult to estimate size and traffic on a site that doesn't already exist, but you can get a rough estimation.  Don't forget that your CMS will consume some space (for example, WordPress can occupy about 20MB after custom themes are installed), and that the data in your database may count toward your hosting allotment.  

If your site is primarily text, it will take up considerably less space than a site that routinely publishes photos and audio.  A video blog can consume not only a significant amount of storage, but if the video is meant to be served through a streaming server, it can consume processor resources as well.

Traffic is probably the most difficult thing to estimate for a new site.  Sites that have been running for a while have the benefit of analytics to gauge their traffic.  Unless you know you have a built-in audience for your site, you can probably assume that your traffic needs are fairly low.  You should still take into account the amount of data that your visitors will transfer over their visit, because depending on the host you select, this can factor into your costs.
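To make the transfer estimate concrete, here is a rough back-of-the-napkin calculation in Python.  The function name and the sample numbers (200 visits a day, 3 pages per visit, 500 KB per page) are my own assumptions for illustration, not figures from any particular host or site.

```python
def monthly_transfer_gb(visits_per_day, pages_per_visit, avg_page_kb, days=30):
    """Rough monthly data transfer in gigabytes (1 GB = 1024 * 1024 KB)."""
    total_kb = visits_per_day * pages_per_visit * avg_page_kb * days
    return total_kb / (1024 * 1024)


# Hypothetical small site: 200 visits a day, 3 pages each, 500 KB per page.
estimate = monthly_transfer_gb(200, 3, 500)
print(f"{estimate:.1f} GB per month")
```

Plug in your own guesses and compare the result against the transfer allowance of the plans you're considering.  The point is the order of magnitude, not precision.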

Technology Needs - Content Management Systems

There are many different Content Management Systems (CMS).  Different CMSes have different server requirements that must be met to have a site that performs well.  If all you are publishing is a set of static HTML pages and photos, then you likely do not need much more than basic web hosting.  If you're running an instance of Drupal, you may need more horsepower in your server to generate the pages for serving.

Consider the minimum requirements of your CMS.  For example, if you use WordPress, your host should supply a currently supported version of PHP, and a current version of MySQL as the database.  Your CMS of choice may require a different set of baselines, including specific versions of Python (or another server-side language) or MongoDB (or another database server).

Types of Hosting

There are several types of hosting, each with pros and cons.  The type of hosting you choose will depend on the needs of your site and your budget.  

The types listed here are not the only types, but if you need hosting other than these, you're very likely at a stage where you'd be better off hiring a contractor to help you decide.

Hosted Services

Hosted Services offer the content management component as the sole offering of the hosted service.

Pros

  • Very cheap for starting sites
  • Simple to start up and use
  • Typically, reasonable support options are available

Cons

  • May not offer flexibility for custom themes
  • Will not provide capabilities for services outside of what they offer
  • Some features (like custom domain names) may have a premium

My Thoughts

If you want a no-frills package, a hosted service is a low-headache option that works well, at the cost of loss of customization.

Examples

Free Hosting

There are many hosts that offer hosting for no cost.

Pros

  • It's free

Cons

  • May be restricted to a certain subset of features that cripple most CMSes
  • There is typically a very low site size limit
  • The servers are often overloaded with free "customers"
  • The servers are typically under-powered
  • Support is nearly impossible to get for free
  • Some hosts delete inactive sites
  • Some hosts force ads onto your pages
  • Practically everything else

My Thoughts

One of the major take-aways about all types of hosting is that you almost always get what you pay for.  In the case of free hosting, you're practically getting exactly what you pay for.  If you choose free hosting, prepare to either abandon your site or move it to a capable host very soon after launch.

Examples

Shared Hosting

Shared hosting is characterized by a service that offers its customers space on a single server that is shared with other customers.

Pros

  • Typically, shared hosting is the most affordable hosting
  • Support is often available by email, message board, or phone
  • Often has a very simple-to-use control panel

Cons

  • Sharing resources with others means fewer resources for your site
  • Shared resources open potential security issues
  • Overselling

My Thoughts

Shared hosting is probably the most popular self-hosting option because of its low price and capable features. Still, remember that you are sharing this server with other customers, which means that when they get more traffic, your site can slow down.  Also, you're only as secure as the least secured customer on a shared host.  If there's someone running insecure scripts in their site, there's the possibility it can affect yours.  It probably will.

Overselling

Overselling on shared hosting is an important concept to understand. There are many hosting services that offer unlimited storage, bandwidth, and CPU for prices as low as $7 per month. How are they able to offer this rate to everyone? They bank on you not using everything they offer.

If each site only uses 50MB of storage on a server, it's possible to fit hundreds of customers onto a single server, and still offer them unlimited or high capacity.  Some hosts, if you start to use more space than other customers, will move your site to a server that has more capacity.  This can be an inconvenience.  
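The arithmetic behind overselling is easy to sketch.  In the Python below, the 40 GB disk is a number I made up for illustration, and the function name is hypothetical; only the 50MB-per-site figure comes from the paragraph above.

```python
def customers_per_server(disk_gb, avg_site_mb):
    """How many sites of the given average size fit on one disk (1 GB = 1024 MB)."""
    return (disk_gb * 1024) // avg_site_mb


DISK_GB = 40  # hypothetical server disk, not a figure from any real host

# Light users at 50 MB each versus heavier users at 500 MB each.
print(customers_per_server(DISK_GB, 50))
print(customers_per_server(DISK_GB, 500))
```

The same disk holds ten times fewer customers once the average site grows tenfold, which is why a host that has sold "unlimited" plans has an incentive to move or discourage its heaviest users.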

In a worst-case scenario, a shared host can claim that by using more space or CPU you are violating their Terms of Service by affecting the performance of other customers' sites on that server.  Sometimes, they may send a warning email.  Other times, they may cancel your account without notice.  The lesson here is to be sure to read your host's terms of service before entering an agreement to allow them to host your site. 

Examples

PaaS Hosting

Platform as a Service (PaaS) hosting offers a host for a specific technology platform, but these hosts are often unique in the way that you deploy to them.  Examples of a technology platform include Ruby on Rails and WordPress.

Pros

  • The platform is maintained by the host
  • The platform is usually tuned for performance
  • Allows customization that shared hosts don't

Cons

  • Can be costly
  • Not usually configured for multiple applications
  • Sometimes deployment requires knowledge of source code management (SCM) tools

My Thoughts 

Many PaaS offerings require you to know git to deploy code to their servers, which isn't hard but can be daunting if you've only ever used FTP to transfer sites.  Because the platform is managed, it's usually costlier than shared hosting, but it's also usually not as costly as managed VPS or dedicated hosting, and can be a great option if your site uses only the technology that the platform supports.

Examples

VPS/Cloud Hosting

A Virtual Private Server is software that simulates a whole physical server running inside of a larger host server. The large host server can hold many of these virtual servers, each of which has been allocated a specific amount of the host server's resources. The virtual servers do not interact with one another.

Pros

  • A full server environment all for your site
  • Can host multiple sites
  • Can usually be quickly changed in size to allow for growth

Cons

  • More expensive
  • Some server maintenance knowledge is usually needed

My Thoughts 

One of the best abilities of VPS servers is that they can easily scale.  If you find that your server is not powerful enough to handle the current load, you can request that your host "resize" your server to provide it more capability.  Be wary, though; many of these hosts charge for metered CPU usage, storage, and transfer.

The obvious downside to a VPS server is that some knowledge of how to run a server is often required, which is typically not the case for shared hosting.  There are hosts available that will provide you with "managed hosting", in which they manage your server for you.  These options are typically much more expensive, but a poorly maintained server is a ripe candidate for security problems, which at best can lead to unwanted downtime for your site, and at worst can be used to infect other more-secure machines more easily.  Security is everyone's responsibility!

Examples

Dedicated Hosting

Dedicated hosting is the simplest to explain -- It is a single physical box connected to the internet that you rent to host your site.

Pros

  • Possibly the most powerful hosting option
  • The whole server is yours to do with as you please

Cons

  • Very costly
  • Server maintenance knowledge is usually needed
  • Single point of failure

My Thoughts 

Dedicated hosting can be very useful for sites that have specific back-end requirements.  Unless your blog is doing millions of page views per day, this is probably too much power and maintenance headache for a blog.

Just like the VPS option, there are services that offer managed hosting on dedicated servers, and these come at a commensurate cost.  Unlike VPS hosting, dedicated servers are difficult to scale.  This is one reason why cloud-based hosting, where the server can be resized and grouped with other hosts, has become more popular for larger sites than dedicated hosting.

Examples

Included Features

There are standard features that you should expect to receive along with your hosting, and many features that hosts offer that you should be aware of beyond simply hosting a site.

Support

I've put this at the top of the list because it's the easiest thing to check, one of the most important things to get when you need it, and the hardest thing when you miss it.  Before you pay a host, call their support line.  Oh, they don't have a support line, you say?  Glad you know that now, aren't you?

Be sure that there is some way to contact a human in the case of a problem.  Know what it costs to have their customer service address issues that you've created, like restoring things from backup or fixing mangled server configurations.  Be aware of whether the host offers a downtime/status monitor at a dedicated location on the web.  (There's nothing worse than a whole host going down and their status page going with it.)  Check out the host's knowledge base to see what kinds of questions their customers typically ask.  This can be enlightening and save a lot of trouble over the long run.

Web Server

Most often the web server used for hosting is going to be Apache, but some hosts offer specialized web servers for the sake of performance or the technology platform used. Knowing whether the web server they provide can handle your site is important if you have a specialized CMS with unique requirements.

Control Panels

There are two types of control panels.  One lets you control your hosting plan as a whole - arrange payment and configure new servers.  The other lets you configure features on individual domains or sites.  cPanel is a popular example of a shared hosting control panel.  Some control panels allow you to deploy software on your site with the push of a button.  These are convenient, but be aware that the control panel may not be updated with the latest version of the deployed software, and you should make sure that your software is fully updated.

Email

Some hosts provide email for your domain as part of the hosting offering, including web access to the email. Usually, the email that comes with the web host is not as feature-rich as what dedicated email hosts offer. With the ubiquity of Gmail for domains, you often need a specific reason to use email that your host provides.  Still, if needed, host-provided email can be essential.

Logs

Hosting logs are nice to have access to, not so much for analytics (which is now handled better by third-party services) but for access to error reporting.  If something goes wrong on your site, having access to the error logs can be essential in sorting out the problem.  PaaS and Shared Hosts sometimes make these logs difficult to find or obtain.

DNS

Domain names can also be part of your hosting package.  Depending on the configuration of your site, your host may even require that your DNS records are hosted with them.  Be aware of the situation before you buy, especially if your host offers to register the domain on your behalf.

Backups

You are the only person responsible for your site backups, but your host can provide you with the resources to make backups easy.  Even if you are able to create "images" of your whole hosting system in case of emergency, you should see if your host offers any way to store those images (or other backups) in a remote location that is compatible with other platforms.  A host that offers compatibility with other hosts isn't afraid of losing your business, and that's a good sign.

Shell Access

Shell access is something that sounds advanced, but if you're using a server that has it, it can be invaluable for a few reasons.  If you use source code management tools, like subversion or git, having shell access on your server will allow you to use those tools to deploy to the server directly.  

If you use FTP to transfer files to your site, you should stop!  Shell access allows you to securely connect to the server and use SFTP to transfer files securely.  While the files you transfer might not be important to some hacker snooping on your traffic in a coffee shop, the FTP password to your site definitely would be useful for uploading whatever botnet software they like.

Things to Look Out For

Here are some random things that you should look out for or be wary of when choosing a host.

You Get What You Pay For - As I mentioned above, if the price is too good to be true, then there is usually some deep flaw hidden under the surface.  Any hosting that offers a price disproportionate to the features is probably worth skipping.

Get Recommendations - Don't jump immediately at the cheapest host you can find having the features you need.  Be sure to solicit recommendations from peers or user groups so that you know you're dealing with a reputable hosting company.

Spread Your Eggs Around - Many hosts offer one-stop shopping for all your hosting, be it web, email, or DNS.  I recommend that you do not host your site at the same place you register your domain, for a little added security if you need to transfer your site to a different host.

Be Wary of Overselling - As mentioned earlier, one of the easiest ways to get shafted by a host is by using oversold services.  If you choose a shared host, be sure that you get some recommendations from other users so that you know how the host behaves when your site gets the traffic you're hoping for.

Switch Hosts at Whim - After you deploy your site, you'll want to keep regular backups of all your code and data.  This isn't just to mitigate catastrophe, but to make it easy to deploy your site to a new host at a moment's notice if they're not performing to your standards.  With so many competitively priced hosts out there, there is no reason to stick with a host that doesn't provide the features or support you need.  Arrange your backups in such a way that migration to a new host is trivial.

The Admin Matters - There should be a web interface for your interaction with the host, for doing things like initial configuration of the server, submitting support tickets, and payment.  If the web interface is difficult to use, or is riddled with upselling ads, this is often a sign of the kind of thing you can expect from support.

Keep an Eye Out for Lifetime Hosting - Sometimes hosts offer customers the ability to pay one large sum for a lifetime hosting account -- hosting that is ever after free of cost provided that the company remains in business. Usually these accounts are pretty low on the support attention span, but if you can live with that for a simple site, it's a reasonable way to get hosting cheaply from an otherwise reputable host.

Server Farm photo by Simon Law

You know that permission page, with all the checkboxes? Wouldn't it be nice if you could drag a bounding box around a bunch of those checkboxes to toggle all of the selected checkboxes? Well, since you asked so nicely...
I created a script, dps.js, and a bookmarklet that will do just that!
Step 1: Drag this link to your bookmarks toolbar to create a bookmarklet:
Drupal Permissions Script
Step 2: Visit your Drupal permissions page.
Step 3: Click the bookmarklet that you created. Nothing will seem to happen.
Step 4: Then do this:


Another side-effect of this bookmarklet is that the column areas that surround the checkboxes become hit-targets for the checkboxes they're near. So you can miss the stupid tiny things slightly and still toggle the box.
Here's the official repo, to which you are welcome to submit patches: https://github.com/ringmaster/serverscripts
Enjoy!

Quite a while ago, I started work on a Reversi game, mostly to see if I could challenge my friend who is a pretty strong Reversi player. I don't have much experience with this sort of task, as it's not the sort of thing on which I generally work, but it was a nice change of pace. While it was new to me, this IS a pretty well-trod area... I never expected any revelations or unique developments. Certainly, if you expect to find something unavailable elsewhere here, you are mistaken, but if you want a quick look into some interesting challenges, mostly performance-related challenges, hopefully this entry (and related entries) will be interesting. I'll be breaking it into a number of posts:

  1. Basic introduction to Reversi AI
  2. High-level performance considerations
  3. Large-scale optimizations
  4. Search tree algorithms
  5. Small-scale optimizations
  6. Wrap up and Resources

I suppose first I'll give an introduction to Reversi. You can find plenty of descriptions online, but for the lazy among us, here's a breakdown: Reversi is a 2 player game played on an 8 by 8 grid of equal-sized squares. Each player has their own unique color (traditionally black and white). The pieces are flat and cylindrical, with top and bottom sides solidly colored one for each player. Pieces are placed centered in the squares on the board and the top color of the piece indicates which player owns that square (initially this must match the player that played the piece).

The starting configuration for the board has the 4 middle squares filled by pieces, alternating white-black on the upper middle row, then black-white on the lower middle row. Traditionally, play starts with the black player. Players take turns placing a piece into a valid position until neither player has a valid move. If a player doesn't have a valid move, they pass, but otherwise they must place a piece (players cannot opt to pass).

A move is valid if it results in at least one of the opponent's pieces flipping. This happens when, horizontally, vertically, or diagonally, a line can be drawn from the move's position through 1 or more of the opponent's pieces and then to another of the current player's pieces without going through an empty square or piece owned by the current player. This sort of sandwiching of the opponent's pieces with the current player's pieces is called "flanking". When a piece is played, all of the flanked opponent's pieces (in every direction) are flipped to be owned by the player who just placed the piece. The winner is the player who has the most pieces at the end of the game.
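The flanking rule is easier to follow in code than in prose, so here is a minimal sketch in Python. This is not the code from my game; the board representation (a list of lists of one-character strings) and every function name here are assumptions chosen for readability rather than speed.

```python
EMPTY, BLACK, WHITE = ".", "B", "W"
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def starting_board():
    """8 by 8 board with the four middle squares filled as described above."""
    board = [[EMPTY] * 8 for _ in range(8)]
    board[3][3], board[3][4] = WHITE, BLACK   # upper middle row: white-black
    board[4][3], board[4][4] = BLACK, WHITE   # lower middle row: black-white
    return board


def flanked(board, row, col, player):
    """Return the opponent pieces that would flip if player moved at (row, col)."""
    if board[row][col] != EMPTY:
        return []
    opponent = WHITE if player == BLACK else BLACK
    flips = []
    for dr, dc in DIRECTIONS:
        line = []
        r, c = row + dr, col + dc
        # Walk over an unbroken run of opponent pieces.
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == opponent:
            line.append((r, c))
            r, c = r + dr, c + dc
        # The run only counts if it ends on one of the player's own pieces.
        if line and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            flips.extend(line)
    return flips


def valid_moves(board, player):
    """Every square where the player would flip at least one piece."""
    return [(r, c) for r in range(8) for c in range(8)
            if flanked(board, r, c, player)]


def play(board, row, col, player):
    """Place a piece and flip everything it flanks, in every direction."""
    flips = flanked(board, row, col, player)
    if not flips:
        raise ValueError("not a valid move")
    board[row][col] = player
    for r, c in flips:
        board[r][c] = player


print(valid_moves(starting_board(), BLACK))
```

From the starting position, black has exactly four valid moves, one adjacent to each white piece along a row or column.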

One area that was particularly new to me was the development of artificial intelligence for the computer players. Like many pieces of technology, it seemed as if it would be wondrous: Neural nets and/or complex logical rules. Indeed, if you look at traditional human strategic elements, you will see mention of things like parity, stable pieces, and balanced edges. Somehow, these ideas must be combined together to produce an incredibly intelligent player.

Sadly, like many magic tricks, the wonder dissolved once the secret was revealed: Try to think as far ahead as possible. By simulating every possible move from both sides, the AI is able to look ahead a number of moves. Each move it looks ahead is called a "ply". The further ahead the computer can calculate (i.e. the more ply), the better a move it is likely to make. If the computer is able to calculate to the end of the game, then it will know if it will win or lose and the sequence of moves to get there.

What's important to understand at this point is that Reversi is a relatively deep game, in that there are many turns per game (usually 30 per side). Also, to know who won the game, you need to get to the end of the game and then it's down to a simple question: Who has more pieces on the board? In the ideal case, the computer could solve the game all the way to the end, trying every possible move, and then it could know which move would be the best move. Unfortunately, that's not really feasible with the current technology; the number of possible games is extremely large as there are usually many potential moves per board position... on average about 7.488, according to my estimation. This grows exponentially with each move... it doesn't take too many ply before it takes more than a day to compute all the possible board outcomes. From this, we can conclude a few things:

  1. We need a way to deal with the fact that we're not going to be able to solve to the end of the game
  2. Performance will be pretty important
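To get a feel for how quickly this blows up, here is a small Python sketch. The branching factor is my estimate from above; the rate of one million positions per second is a number I picked purely for illustration, and the function names are hypothetical. It also only counts the positions at the deepest ply, so it understates the total work a little.

```python
BRANCHING = 7.488                 # average moves per position (my estimate above)
POSITIONS_PER_SECOND = 1_000_000  # assumed evaluation speed, purely illustrative
SECONDS_PER_DAY = 24 * 60 * 60


def positions_at_depth(ply, branching=BRANCHING):
    """Approximate number of positions reached after looking `ply` moves ahead."""
    return branching ** ply


def first_ply_exceeding_a_day(rate=POSITIONS_PER_SECOND, branching=BRANCHING):
    """Smallest ply whose positions take more than a day to visit at `rate`."""
    ply = 1
    while positions_at_depth(ply, branching) / rate <= SECONDS_PER_DAY:
        ply += 1
    return ply


for ply in (2, 4, 8):
    print(ply, round(positions_at_depth(ply)))
print("more than a day at ply", first_ply_exceeding_a_day())
```

A faster machine shifts the answer by only a ply or two, since every extra ply multiplies the work by roughly the branching factor.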

Dealing with uncertainty

Given the computational needs of completely solving Reversi, it is pretty obvious that the AI has to deal with the fact that completely solving Reversi is not going to happen. Generally the method used is to play the game as deeply as we have time for, and then come up with an accurate estimate of how good the resulting board is. In an ideal world, this estimate would indicate how likely that board is to lead to a win for the current player. This depth will eventually reach the end states of the game and, when that happens, the estimation can be replaced by certainty.
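The "play as deeply as we have time for, then estimate" idea is usually written as a depth-limited minimax search. The sketch below is a generic Python version, not the search from my game: the game rules are passed in as three hypothetical callbacks (`moves`, `apply_move`, `evaluate`), and I exercise it with a made-up toy number game so it runs on its own. It also treats "no moves available" as the end of the line, which is a simplification; in real Reversi the player would pass instead.

```python
def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    """Best score reachable from `state` when looking `depth` ply ahead."""
    options = moves(state, maximizing)
    if depth == 0 or not options:
        # Out of time (or out of moves): fall back on the estimate.
        return evaluate(state)
    scores = [
        minimax(apply_move(state, move, maximizing), depth - 1,
                not maximizing, moves, apply_move, evaluate)
        for move in options
    ]
    # Each side is assumed to pick the outcome that is best for itself.
    return max(scores) if maximizing else min(scores)


def best_move(state, depth, moves, apply_move, evaluate):
    """Move for the maximizing player that leads to the best minimax score."""
    return max(
        moves(state, True),
        key=lambda move: minimax(apply_move(state, move, True), depth - 1,
                                 False, moves, apply_move, evaluate),
    )


# Toy game, invented for this example: the state is a number. The maximizing
# player may add 1 or 3, the minimizing player may subtract 1 or 2, and the
# "board strength" is simply the number itself.
def toy_moves(state, maximizing):
    return [1, 3] if maximizing else [-1, -2]


def toy_apply(state, move, maximizing):
    return state + move


def toy_evaluate(state):
    return state


print(minimax(0, 2, True, toy_moves, toy_apply, toy_evaluate))
print(best_move(0, 2, toy_moves, toy_apply, toy_evaluate))
```

For Reversi, `moves` would be the valid-move generator, `apply_move` would place a piece and flip the flanked pieces on a copy of the board, and `evaluate` would be the board estimate discussed next.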

This estimation may take in any number of different calculations and combine them in an unlimited number of ways. Generally, it seems that the most common way is to use linear combination to combine the different metrics together into the final estimate. This solution is quick, easy, and provides good results. Linear combination involves combining a number of features, like "available moves" or "pieces on board", applying a weight to them, and then adding them together. There is no limit on the number of terms that can be combined in this way or what the weights may be. A simple (arbitrary and poor) example of this sort of algorithm: board_strength = 0.3 * stable_pieces - 0.1 * number_of_valid_moves. In this example, the algorithm is trying to maximize the number of stable pieces and reduce the number of valid moves. Because of the weighting, an increase of 1 stable piece offsets the increase of 3 valid moves.
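In Python, a linear combination like that is only a few lines. This sketch reuses the same arbitrary (and poor) weights from the example; the dictionary layout and the function name are assumptions of mine, and the feature values would have to come from your own board analysis.

```python
# Weights from the arbitrary example above: reward stable pieces,
# penalize the number of valid moves.
WEIGHTS = {"stable_pieces": 0.3, "number_of_valid_moves": -0.1}


def board_strength(features, weights=WEIGHTS):
    """Weighted sum of the named features; a higher result means a better board."""
    return sum(weights[name] * features[name] for name in weights)


# One extra stable piece is offset by three extra valid moves.
before = board_strength({"stable_pieces": 4, "number_of_valid_moves": 6})
after = board_strength({"stable_pieces": 5, "number_of_valid_moves": 9})
print(before, after)
```

Adding another metric is a matter of adding one more entry to the weights, which is part of why this approach is so common.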

Possibilities range from choosing a single metric, to using machine learning to discover a superior weighting system. If you're interested in the latter, I found this lecture on Autonomous Derivation interesting, if a bit much to dive into without preparation. You could probably also just search for the ideas covered there, like linear regression and gradient descent, and find something useful.

I'm not really going to discuss the specifics of what metrics should be used or how they should be weighted, but I WILL say that this evaluation function must be fast, since it is one of the most common calculations the AI will make.

Next post, I'll talk about some of the philosophical concepts of performance and look at different kinds of optimizations.

I was complaining yesterday about the login process that Contenture uses on its member sites to gain visitors. Rather than just complain, I thought I would give some constructive response to their request for feedback, which is coincidentally one of the topic areas that I like to study on this site.

Doing It Right vs. What We Actually Do

We developers talk a lot about how a project should be done "right", and hardly ever are they done that way. There is a very wide gap between what we should do (the stuff they make you take classes on in college, but you never do in real life) and what we actually do (the stuff we end up doing to make things happen in time and budget, but nobody ever bothers to teach in a class). I think that every developer cuts corners on either end, some more on one side than another, but there is someplace in the middle that makes it work. I like to think of this area as the back of the napkin.

Attached to this post is a flash applet showing something I drew depicting my impression of the Contenture login process. We've basically got two types of site, and three types of visitor. The first type of site is the content producer, someone who uses Contenture to remove ads and potentially display premium content to Contenture members. The other type of site is the affectionately named, "Affiliate Whore", whose only current business with Contenture is to make a quick buck from a new user signup.

There are three types of visitors to either of these sites:

  1. Visitors who are not Contenture members
  2. Visitors who are Contenture members but are not logged in
  3. Visitors who are Contenture members that are signed in

Of the three, the last is the easiest to figure out. The affiliate site doesn't really need to care who you are - it can display an affiliate link whether you're a member or not. The content producer will probably need to know that the visitor is a member and then display appropriate content for that user. That part is actually pretty easy with the way things are now.

User Stories

For visitors that are not Contenture members, both sites need a way to tell these visitors that they should sign up. When they choose to do so, they would be taken to a page that allows them to register for the program. Optionally, when the sign-up process is complete, they could be returned to the originating page. The affiliate-only site isn't going to care too much about the return, but the content producer is. An instant return visit from a newly registered user guarantees at least one content-based hit on the site for the purposes of tracking payment to network members, of which that site will be one.

For visitors that are Contenture members but are not logged in - which happens in the case of users who roam between computers, use multiple browsers, or are developers that dump cookies every few hours - things are a little trickier. A site can't know that the user is a member if they're not logged in. They'd probably need to see a standard-looking "this site uses Contenture" banner/button/logo somewhere that they could click to get the standard login.

My premise is that you use the same affiliate sign-up link to go to a page that is both the new user sign-up form and the existing user login form. Even if the "join our program" link on the content publisher's site doesn't say "Contenture" in it (even though it should, for reasons I'll mention in a minute), they'll still arrive at the Contenture login page and will respond accordingly.

The Affiliate Link

The "join our program" link should definitely say "Contenture" somewhere in it. The reason is that if a visitor encounters a site that has a program described only as "Pay $X to avoid ads and obtain premium content", then he's already considering the money he has paid to Contenture and wondering if it's worth it to join a new site. Whereas if the banner/button says "Contenture" on it somewhere, the choice to continue with the login for an existing user is not an issue.

Also, my thoughts on what the button should say lean more toward providing an advantage to the visitor. So unless it's the only benefit, it shouldn't say "Support This Site". Supporting a site monetarily is an obvious benefit of paying for premium content. What it should say is what the visitor gets from the site in exchange for signing up. My thoughts resulted in something to the effect of "Fewer ads and more premium content -- support this site with Contenture".