Running Docker on Debian 7 (Stable)

I’ve been wanting to get Docker running for some time. The only remaining hurdle was the age of Debian’s stock kernel, which was too old to support the Linux container infrastructure. Now that the kernel has been updated, installing Docker is simple.

Add the Debian “testing” repository to your APT sources.list and, optionally (though strongly recommended), pin it to prevent unnecessary upgrades:

# Install the Debian testing sources.
cat > /etc/apt/sources.list.d/debian-testing.list <<EOF
deb http://ftp.debian.org/debian testing main
deb-src http://ftp.debian.org/debian testing main
EOF
# Optionally pin debian-testing to -100.
cat > /etc/apt/preferences.d/debian-testing <<EOF
Package: *
Pin: release a=testing
Pin-Priority: -100
EOF
# Update APT sources.
apt-get update
# Install Docker.
apt-get install docker.io/testing

Done.

Using RDoc with Ruby C Extensions

While building a native C extension for Ruby, I discovered that RDoc will occasionally refuse to document valid code, emitting no warnings or error messages when it does so. If class definitions are spread across multiple C source files, RDoc will skip parsing documentation for a class when it cannot locate any documentation for the classes or modules it depends on.

This issue can show up when you have multiple classes defined under a module, with each class and module in a separate C source file. RDoc has no knowledge of the structure of C programs (nor the ability to run the C pre-processor), so it isn’t capable of understanding some valid constructs and requires a kick in the pants to do so.

In your extconf.rb, you will need to add a compiler flag before the call to create_makefile:

$defs.push('-DRDOC_CAN_PARSE_DOCUMENTATION=0')

This adds a pre-processor definition of RDOC_CAN_PARSE_DOCUMENTATION that evaluates to zero. Then, in the initialization function for each class, you need to add a bit of dummy code for RDoc’s benefit:

#include <ruby.h>

extern VALUE mMyModule;  /* actually defined in the module's own source file */
static VALUE cMyClass;

void Init_MyExtension_MyModule_MyClass(void)
{
#if RDOC_CAN_PARSE_DOCUMENTATION
    mMyModule = rb_define_module("MyModule");
#endif
    cMyClass = rb_define_class_under(mMyModule, "MyClass", rb_cObject);
}

Now, RDoc will see the module definition and parse documentation as usual, while the C pre-processor will omit this code from the compilation unit. It’s a bit of a hack, but it’s needed to completely document the extension.

Git Aliases for SVN Repositories

I use Git often, but I’ve also encountered SVN repositories that I would like to work with. Git-SVN provides the low-level integration with SVN, but the steps required to set it up can be a hassle. Because of this, I’ve created some Git aliases to speed things up. (Yes, it’s a hack.)

# ~/.gitconfig
[alias]
    rebase-svn = svn rebase -q
    commit-svn = svn dcommit -q
    clone-svn  = svn clone -q --prefix=svn/ -s
    track-svn  = branch svn-sync svn/trunk
    setup-svn  = !sh -c 'git clone-svn \"$1\" \"$PWD\" && git track-svn' -

Now I can initialize an existing Git repository to track an SVN repository as well.

# Create a Git repository for testing.
$ git init test-git-repository && cd test-git-repository
Initialized empty Git repository in /home/user/test-git-repository/.git/

$ git setup-svn svn+ssh://hostname.domain.tld/path/to/repository/
$ git branch
* master
  svn-sync

# When you're ready to send your commits to master upstream...
$ git checkout svn-sync
$ git rebase master
$ git rebase-svn
$ git commit-svn
Committing to file:///home/user/test-svn-repository/trunk ...
...

The “svn-sync” branch will track the remote SVN repository. You get all of the benefits of Git’s decentralized model — like a full copy of the remote repository and history — while still retaining the ability to commit to the upstream SVN repository.

Automatically Rewrite Git Push/Pull URLs

Git can automatically rewrite a repository URL if it matches a filter specified in its configuration file. These filters operate during both fetching and pushing. The following addition to your global “.gitconfig” file converts all HTTP(S) GitHub repository URLs, as well as those prefixed with “github:”, to the bare Git protocol for pulls and to SSH for pushes:

# ~/.gitconfig
[url "git://github.com/"]
    insteadOf = https://github.com/
    insteadOf = http://github.com/
    insteadOf = github:

[url "git+ssh://git@github.com/"]
    pushInsteadOf = https://github.com/
    pushInsteadOf = http://github.com/
    pushInsteadOf = git://github.com/

Now you can clone GitHub repositories using the “github” prefix:

$ git clone github:username/repository
Cloning into directory ...
Receiving objects: 100% (1/1), done.
$

Git will now rewrite any pull from GitHub to use the bare Git protocol, and any push to GitHub to use the SSH-secured repository address.

Hot Patching: An Update to “Unconditional Jumps on x86 and x86-64”

This morning when I checked Reddit’s programming board, I noticed an article that stood out because it had linked to one of my earlier posts about hot-patching. As one reddit user brought up in the comments (referring to the linked article):

Unfortunately, this article doesn’t discuss how to do the hard part — possibly calling back the original function after you modified it. This involves writing a trampoline with a modified version of the function preamble that was overwritten, fixed up so PC-relative instructions are still correct in the (relocated) trampoline. This is a popular method on jailbroken iOS devices that MobileSubstrate performs.

I’d wanted to write an entry about that exact process and the problems that crop up when attempting it, so I thought tonight I would do exactly that. The same platform caveats of the previous article still apply. I should also add that I’ll focus only on the ELF binary format, though other operating systems use other formats.

To make a long story short, the reason this part is relatively complex (and part of the reason I left it out of the previous article) is that it involves reliably decoding the x86 and/or x86-64 instruction set. To put it lightly, decoding the instruction set is an extremely complex task itself (and rife with its own problems) so I’d normally rely on a disassembler library like udis86 or diStorm64. Each has its own API and its own way of doing things, so I’ll leave using them as an exercise to the reader.

The first case is the simplest: we’re about to place an unconditional jump at the prologue to a function, just as we were going to before. We need to calculate the total number of bytes that our replacement opcodes will use.

For a direct unconditional jump, it’s the size of the direct jump opcode (direct jumps are single-byte opcodes) plus its four-byte displacement (a maximum offset of 2GB in either direction), five bytes in total. For the indirect unconditional jump, it’s the size of the encoded movabs opcode, its target register, the 64-bit address we want to jump to, and the encoded indirect unconditional jump opcode, which all add up to 13 bytes in total.

Now we need to start up our disassembler at the function entry point (or wherever we are placing the detour patch). Point it at the right offset and begin disassembling whole instructions until we’ve reached a point equal to or greater than the total number of bytes we need to overwrite. We want to find the least number of whole instructions that we can overwrite so that we don’t leave half-written opcodes dangling.

Next, allocate a chunk of memory in the process equal to the total size of the disassembled instructions and copy the source’s bytes directly into it. This is our backup location, storing an exact copy of the bytes we are going to overwrite. Now we can overwrite the target as usual. Remember to write NOP (0x90) instructions immediately after the detour, so that we don’t leave dangling opcodes behind. Now our detour is in place. If we want to restore the default functionality, we copy the source bytes we backed up over-top of the detour.

Now the hard part: how to transfer control flow back to the original function, without overwriting our detour. This is where things get very complex, very fast. The path taken forks at the architecture we’re using. I’ll start with the x86 and move onto its newer cousin in a bit.

Short and sweet: we have to disassemble the previously backed-up bytes, one at a time, adjusting all hard-coded addresses and offsets (including implicit ones) for their new location. Since we don’t want to directly overwrite our backup instructions, we have to make an extra copy of them somewhere else, piece by piece. (Hint: state machines are great for this task.) Because the x86 doesn’t have a way to get the current value of the instruction pointer without modifying the stack (not one that I know of, anyway), we have to make some educated guesses about what is going to happen “next” from the point of view of the executing code, similarly to how we computed the offset displacement for the direct jump. Any instruction that attempts to access addresses or offsets outside of the small block of instructions we backed up needs a rewrite.

This includes any control transfer instructions to other areas of code, any calls to other functions, the global offset table (GOT) or the procedure linkage table (PLT) for the ELF executable, any calls to __i686.get_pc_thunk.bx (which loads the %ebx register with the current instruction pointer, post-instruction) for code compiled on the x86 as position-independent (many shared libraries and executables are), and the occasional changes to the stack that would use or manipulate hard-coded offsets or addresses. This list also changes depending on the architecture that you’re hot-patching for.

As for the x86-64, many of the ABI changes alter the instructions you need to rewrite. Some are simpler, and some are not. Because of the expanded address space, you need to account for the offsets at which the backed-up instructions reside, and either deliberately place the executable copies within the 2GB maximum offset or use indirect unconditional jumps to move control flow between them. The former is safer for many things because the latter has to clobber registers in most cases. Beware of issues at the boundary of the 2GB offset; they can trip up a lot of automated rewriting on x86-64. Additionally, unlike the horrid mess that is “__i686.get_pc_thunk.bx” on x86, the x86-64 has %rip-relative addressing, which means that instead of using stack push-and-pop tricks to get the instruction pointer, you can use it directly.

These lists include only the basics. Depending on the code you’re reverse engineering, many other combinations of instructions may need rewriting. Any rewritten instructions must fulfill their contracts. This means that, to the executing process, the control transfers and voodoo magic we’re doing are entirely transparent, and the program functions exactly as it would if the detours didn’t exist. If they don’t, you’ll get some weird bugs that will take ages to track down.

At the end of the rewritten instructions, place the proper jump back to the address of the NOP instructions or the instructions following them. When you’re ready to give it a shot, import the address of the beginning of the rewritten instructions into your C program as a function pointer whose type matches the prototype of the function you want to detour. Also make sure that you’re using a similar compiler and compiler options. If you can’t get either of those, either write some glue logic to get the same result with a few more instructions, or try harder; more code won’t make up for the lack of an attempt.

If you’re wondering why this isn’t done more often, it’s usually because there are a lot of x86 disassemblers but not many x86 assemblers with a usable API. It’s likely you’ll have to tack that onto an existing library. If I get some time in the future, I may write a library to do that, but in the meantime the best opcode-research tools are the debugger and the disassembler.

As you can see, once you understand the code from the point-of-view of the processor, rewriting the code while “hot” is fairly easy. If there’s interest, I’ll write some code here and there in this post to make things clearer. If you have any questions, leave a comment below and I’ll try to respond.

Debunking SEO

SEO — otherwise known as Search Engine Optimization — is the process of improving a website so that search engines can more easily index relevant content, with the goal of improving how soon it shows up in a search engine’s results. Online (and sometimes offline) contractors abuse the term “search engine optimization” to sell services that promise to improve SEO ratings. About ninety-eight percent of those so-called contractors — and likely more — are quacks. Companies can waste significant amounts of both time and money on these contractors to improve their website’s SEO ratings. If you run the business, unfortunately, that contractor is hoping to fool you into paying for a rain dance.

Let me reiterate in case you missed the last paragraph: in the majority of cases, these SEO services are a pile of crap. If you’re an executive paying someone to improve your company’s website SEO, I can almost guarantee you’re wasting money that you could be spending on your employees. Most of these contractors know they’re selling you a rain dance and use that as a business opportunity to sap you for as much money as they possibly can.

One half of a search engine generates an index of all the websites that it has found on the Internet. The other half is what the user primarily interacts with: the user types in a set of keywords (a query) and the search engine tries to find relevant information in its index related to that set of keywords (a set of results). This creates a bottleneck: a search engine is a gateway for website traffic. SEO, as we’ve discussed, is the process of optimizing a website so a search engine can find more relevant results for a user’s query.

The key to optimizing a website, then — as trite and obvious as it seems — is relevant and useful content.

That easy, you say? Well, not really, no.

Machines generally have an extremely hard time trying to understand what is and isn’t relevant to users. They can do massive amounts of statistical analysis on the data that they collect but that takes a significant amount of time and most users are impatient. Most modern search engines use other relevant metrics: for example, from the search engine’s point of view, if a user clicks on a result and then clicks the “back” button in their browser, the search engine will see that the user has immediately requested the results page again and notice that the website the user viewed was not useful to them. There are other metrics that search engines use, but the overall message is that a search engine’s results reflect the usefulness of content to a user. If a user doesn’t like what they see and they leave the website, not only will they likely never come back, but the search engine will notice that as well, compounding the effect of useless and irrelevant content.

There are many metrics, but it all boils down to a bit of common sense and a bit of understanding that your users are more than countable business goals. You might have some default blame-reducing phrase like “That’s not a problem for us! We have good content!” and your SEO contractor will likely agree with you. The fact is, however, that if your statistics say your content is useless or irrelevant to your users, no amount of paying a contractor to improve your SEO will solve the problem. If your users find your website in any way irrelevant or useless for their direct and immediate needs, they will leave and they will never come back.

So, it’s fairly clear that it’s important for users to see relevant and useful content. The prerequisite for that is to understand who your users actually are. If you are targeting everyone indiscriminately and you’re not Facebook or some other Web 2.0 social media big shot, you’re heading for a world of hurt. Take a step outside of your business mindset and think about all the people who come to your website. Put yourself in their shoes. What purpose do they have when they visit your website? What should they come to you for? What do they expect to see when they hit the first landing page? You can’t target your content at an undefined market segment and expect to succeed.

At this point you’re probably asking, “Okay, this is all great information, but what can we do to fix it?”

That’s a tough question, and the answer, like so many things in this world, is that it depends. I don’t have a degree in marketing or business administration, yet if I know your business, I can definitely tell you some ways your content is failing to deliver, usually within one or two pages of reading. Again, take a step outside of your business mindset and hear me out.

Your website likely isn’t delivering what you expected because its content is broken. By broken, I mean large numbers of your visitors view one page and then close your website in their browser. Your users don’t want to read what you have to say, and stuffing more keywords, semantic tags, images, “alt” attributes, and Flash media into the page won’t fix the problem they have. Improve the experience of your users by adding real, useful content. What is useful to your users is completely dependent on who you’re targeting. Real content targeted at real people tends to deliver real results.

Most people visit a website to answer a question that they have. That question could be about a solution to one of their problems, advice about a subject, information about an industry, or a product to buy — in that order. If you can’t immediately answer the question they came to you with, they will leave. Mission failed. Most corporate websites focus exclusively on the last of those segments to the exclusion of all else. They likely see it as the quickest return on investment: a quick, fast, and easy way to make a buck from their online presence. That short-sighted thinking means that when the one segment they exclusively targeted evaporates — and it quickly will, once search engines rank the website poorly — the website becomes costly dead weight instead of an active revenue stream.

Many websites also suffer from ego-driven development. The most visible symptom of EDD is company-centric, egotistic website content. Websites suffering from it usually talk about “our commitment”, “our products” or “our services”. See the common word there? It shifts the focus from the visitor back to the company itself. It might make the pointy-haired boss happy to see it, but it provides absolutely no value to the visitor. Many normally savvy business owners tend to believe that marketing and sales copy is all about their business, but nothing could be farther from the truth. Your visitors don’t care about your company. They only care about what the company can do for them right now.

Many websites also like to list lots of facts about their products. As any marketer (or any behavioral psychologist) worth their salt will tell you: people react to emotional stimuli and then rationalize the feelings with facts afterward. Facts are great for post-purchase rationalization, but they don’t connect with people on an initial emotional level. Save the facts for the fact sheets and technical documentation. Tell people about the benefits that your product provides, and show how it compares to the alternatives provided by others in the community.

Often a company will list the facts given to them by an in-house expert without thinking about how those facts relate to the visitor. The technical people who were supposed to recommend your product will never see the website, because the executive who wanted to demo the product got scared away by the information overload. This isn’t to say that you should strip all factual content from your website. On the contrary, facts increase credibility and authority, but listing the product’s benefits increases the number of customers because it appeals to their own self-interest.

The greatest way to improve your SEO isn’t to hire an outside contractor to perform some magic voodoo. Instead, target a defined market segment as a whole, solve a problem they have, and improve your content to be more relevant to your users. There’s an analogy that applies here: website content is like a good book. A search engine is merely a scout for a massive publishing house (your website visitors) to find the good writers. If no one wants to read your book because it doesn’t answer the questions they have, then it won’t get published. Yes, you can write a masterpiece vetted by every expert you have on staff, but it won’t matter if the publishing house never picks it up. Mission failed.

How does this all relate back to SEO contractors? In short, most of those so-called experts are not actually experts at all. They know how to manipulate statistics in Google Analytics to give you a temporary rating boost that seems positive, but only for a premium fee. It’ll give your company a temporary ego boost and may attract some temporary visitors, but it won’t increase your revenue in the long term or help to create a community around your products. In short, these contractors are gaming you for your money and likely selling you bogus services.

Don’t waste your money on SEO contractors. Spend it building useful, customer-focused content that will keep visitors and attract all segments of the market. The more reasons for people to stay on your website, and the more positive traffic you can keep, the less important SEO will be, and the higher your search ratings will be.

Delimited Continuations and the Call Stack

When a program invokes a subroutine it needs to know where to go when it’s finished.

For example, when a function call happens on the x86, the caller pushes a number of arguments onto the stack and then calls the subroutine, which sets up the stack as it needs. When the caller executes the ‘call’ instruction, the processor pushes a return address onto the stack and then transfers control to the subroutine. When the subroutine exits, it transfers control back to that return address and the caller continues with the next instruction. If you analyze the stack in a debugger you can easily see each of these stack frames (or activation records, if you’re a pedantic academic type). All of these records, composed one inside another, form the entire call stack.

Each of those records represents not only an execution history of where the program has been but also the execution future, where the program intends to finish. At any point in a computation, a snapshot of both the call stack and the processor’s registers represents the entire history of the program from start to finish. See where I’m going with this?

A continuation represents the state of a program at a given point in its execution. That state can be packaged up, saved somewhere else, and invoked like a subroutine later on. When the program invokes that continuation, the saved contents replace the current execution context, effectively rewriting the program’s history: the entire captured state of the stack and registers is copied back in, reverting the program to the state it had when the continuation was first captured. The original function can therefore never return to its caller, because its original state of execution no longer exists. This is obvious in continuation-passing style, where no subroutine ever returns to its caller; instead, it passes its own state as an argument to an explicitly provided continuation. In assembly, a subroutine call then boils down to a simple control transfer like a jump. It’s a vast simplification of the typical execution scheme, requiring the programmer to explicitly direct execution flow at a higher level instead of relying on the implicit setup of a function call.

A delimited continuation is a special case of a full continuation. Instead of replacing the entire call stack, it only replaces a specific number of stack frames, rewriting the execution history of the program up to a specific point. This is what makes call-with-current-continuation so powerful: it gives the programmer the ability to freely rewrite the execution history of a program at any given point. This is also why exceptions are a special case of delimited continuations. When thrown, an exception unwinds the stack by a number of frames until a calling subroutine handles it. It replaces the execution context so that the program handles any errors immediately instead of generating a return code that requires the subroutine to finish.

How does this relate to the call stack? Simply put, the nested stack frames are an implicit form of post-hoc continuation passing. The compiler translates the program code into a stripped-down transfer of state across subroutine calls, and the return address (and, by extension, the contents of the stack at that point) represents the current continuation of that stack frame to the rest of the program.