What's The Diff?

This seems like a very simple trick, but it's one that I've found extremely useful: generating a list of every file on a remote server that's not the same as the copy on my local machine.

As part of my workflow I always keep an entire working copy of any websites I work on, databases included, on my development machine. This allows me to work on the site without fear of breaking the live site. When I'm happy with the way my local copy is working I can then copy any files I've changed up to the live server via SFTP.

At this point I'd really like a list of exactly which files have changed, either on my local copy or on the live site. The command line tool diff is perfect for this kind of job, but not so good if you only have SFTP access to the live site. The answer is to use SSHFS and FUSE to mount your remote folder onto your desktop. Then you can use any tool or application on your local machine against that mounted folder, exactly as if it were sitting on your local computer.

There are great tools that give you a nice GUI for SSHFS but I already use Panic's Transmit to connect with FTP servers, and that comes with a "Mount favorite as disk" button that just works. So once you have your live site sitting on your desktop, you can then run diff commands like the following, which will list every file that is not the same in two folders.

diff -rq /my/local/copy/ /Volumes/example.com/public_html/

(Yes, I know Transmit has a "Synchronise" feature that purports to do this same task, but unfortunately it uses the file modified times to decide if something has changed, and at least on my servers, the remote system times frequently get out of sync with my desktop, causing Transmit to report that every single file is different. The diff tool on the other hand compares the contents of each file, which is a much more robust approach.)
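If you'd rather script the comparison than read diff output, here is a minimal Python sketch of the same idea: walk both folders and compare file contents byte for byte, ignoring modification times. The function name changed_files and the two report labels are my own inventions for illustration; this is not how diff itself is implemented.

```python
import filecmp
import os


def changed_files(left, right):
    """Return relative paths that differ by content or exist on one side only."""

    def relative_files(root):
        # Collect every file under root as a path relative to root.
        found = set()
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                found.add(os.path.relpath(full, root))
        return found

    left_files = relative_files(left)
    right_files = relative_files(right)

    # Files present in only one of the two trees.
    report = {path: "only in one copy" for path in left_files ^ right_files}

    # Files present in both trees: shallow=False forces a comparison of the
    # bytes rather than trusting size and modification time.
    for path in left_files & right_files:
        same = filecmp.cmp(
            os.path.join(left, path),
            os.path.join(right, path),
            shallow=False,
        )
        if not same:
            report[path] = "contents differ"

    return dict(sorted(report.items()))
```

With the mounted volume from the diff example, the call would be changed_files("/my/local/copy/", "/Volumes/example.com/public_html/"). Expect it to be slow over SSHFS, since every remote file has to be read in full.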


Tags: tools.

Transparent Content Filtering for Web Developers

Imagine you want to play around with some JavaScript resource that is installed on a client's live server. You might want to do this in order to test some changes you've made to that file: it may work on your development server, but you want to see if that will still be true when it goes live.

This can be tricky. You definitely don't want to upload and pray. Maybe some specific configuration that is only on the live server will cause your AJAX features to behave in unexpected ways? How could you see those errors for yourself without anyone else seeing them at the same time?

One way is to use a content-filtering proxy. This is basically an application that sits between your web browser and your internet connection. Any request that the browser makes will go through the proxy, and any response from the internet will come back through that same proxy. Once you configure this, it's invisible to the browser. So, to accomplish what we need, we could configure the proxy to replace any SCRIPT tags in the page returned from the live server with a different SCRIPT tag that points to the new file on your development server. Everything else will be exactly as it is on the live server, because the page you will see will be coming from the live server. Only you, thanks to your proxy, will see the effects of the new JavaScript.
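To make the idea concrete, here is a rough Python sketch of the kind of rewrite the proxy performs on the page source as it passes through. The script URLs and the function name rewrite_script_tag are hypothetical, chosen only for illustration; a real proxy applies its own filter rules rather than this code.

```python
import re

# Hypothetical URLs: the script on the live server and its replacement
# on the development server.
LIVE_SCRIPT = "http://www.example.com/js/site.js"
DEV_SCRIPT = "http://192.168.0.3/js/site.js"

# Match a SCRIPT tag whose src attribute is exactly the live script URL,
# capturing the text on either side so it can be put back unchanged.
SCRIPT_SRC = re.compile(
    r'(<script\b[^>]*\bsrc=")' + re.escape(LIVE_SCRIPT) + r'(")',
    re.IGNORECASE,
)


def rewrite_script_tag(html):
    """Point the matching SCRIPT tag at the development copy of the file."""
    return SCRIPT_SRC.sub(
        lambda match: match.group(1) + DEV_SCRIPT + match.group(2),
        html,
    )
```

Only the one matching tag changes. Every other tag, including other scripts from the same server, passes through untouched, which is what keeps the rest of the page identical to the live site.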

To illustrate, I'm going to show how I could use this technique to insert my own picture into my view of the New York Times Technology Page. Neither the browser nor the server will know that anything has been changed; only I will see the alteration. Naturally it's possible to change anything on the page. I'm changing an image only because it's easy to see.

To accomplish this you'll need two prerequisites: You'll need to be able to install a proxy server, and you'll need to be able to write Perl-style regular expressions.

Let's start by installing a proxy server. I'm on a Mac, so you may need to adjust this for your own operating system. I've decided to use Privoxy because it's free and works on a wide range of computers. The easiest way to install Privoxy on Mac is to use the MacPorts package manager (formerly DarwinPorts).

$ sudo port install privoxy

Once that completes, you will, by default, have a privoxy binary installed in /opt/local/sbin/privoxy. Before you start it, I would suggest a couple of tweaks to the configuration file.

$ sudo vi /opt/local/etc/privoxy/config

We want to tell privoxy to use our own personal files as part of the configuration. This will make it easier and safer to tweak the configuration going forward. My personal configuration files will live in ~/.privoxy/, so I'll add the following two lines to /opt/local/etc/privoxy/config:

filterfile /Users/michael/.privoxy/user.filter
actionsfile /Users/michael/.privoxy/user.action

Save and close the main configuration file now; that's the last time we'll need to touch it. Now we can create a folder to hold our personal configurations, and add a couple of files to it, like so:

$ mkdir /Users/michael/.privoxy
$ touch /Users/michael/.privoxy/user.filter
$ touch /Users/michael/.privoxy/user.action

Ok, we can now start privoxy up, and it will include our own (temporarily empty) configuration files.

$ sudo /opt/local/sbin/privoxy /opt/local/etc/privoxy/config

Now you need to configure your web browsers to use your new proxy. On Mac, go to System Preferences > Network > Advanced > Proxies and set the Web Proxy (HTTP) and Secure Web Proxy (HTTPS) items to use 127.0.0.1:8118. Save and close the Preferences, and open up a web browser like Safari.

To test that your proxy is running and that your web browser is using it, go to the following URL: http://config.privoxy.org/. You should see some information about your running Privoxy application. By default privoxy will filter out most web ads, though you can adjust this. The important thing is we can add our own filtering. To do that we need to edit those two configuration files we created earlier.
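If you want to check the proxy from a script rather than a browser, a small Python sketch along these lines should do it. It assumes Privoxy is listening on 127.0.0.1:8118 as configured above; the function names are mine, and I make no promises about what the status page contains.

```python
import urllib.request

# Assumed address: the same host and port entered in the Network preferences.
PRIVOXY = "http://127.0.0.1:8118"


def build_proxied_opener(proxy_url=PRIVOXY):
    """Build a URL opener that sends HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status_page(opener):
    """Fetch Privoxy's status page; this fails if the proxy is not running."""
    with opener.open("http://config.privoxy.org/", timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_status_page(build_proxied_opener()) should return the HTML of the same page the browser shows. If Privoxy isn't running, the request raises an error, because nothing is listening on that port.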

First let's create a filter, one that will replace the URL of an image on the NYTimes domain with the URL of an image on my own development server. The rule for that will go in my /Users/michael/.privoxy/user.filter file and looks like this:

FILTER: justtesting This is a test.
s%http://graphics\.nytimes\.com/example\.jpg%http://192.168.0.3/example.jpg%g

(Those aren't real URLs. I've shortened them for purposes of illustration. The point is that you can match any pattern in the page source and replace it with any of your own text.)

If you grok Perl regular expressions, then this substitution syntax should be familiar to you. Here I'm defining a filter named "justtesting" which swaps my own example.jpg URL in wherever it sees a URL matching the one for the example.jpg on the NYTimes server.
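If Perl's s/// syntax isn't second nature, this Python sketch does roughly what the filter line does. It's an approximation for reading purposes only: the % characters in the filter are just delimiters, the backslashes make the dots match literal dots, and the trailing g means "replace every occurrence". The function name apply_filter is made up.

```python
import re

# The same pattern as in the filter, with the dots escaped so they match
# only literal dots.
PATTERN = re.compile(r"http://graphics\.nytimes\.com/example\.jpg")
REPLACEMENT = "http://192.168.0.3/example.jpg"


def apply_filter(page_source):
    """Replace every occurrence of the pattern, like the trailing g flag."""
    return PATTERN.sub(REPLACEMENT, page_source)
```

Python's sub replaces all matches by default, which is why there's no equivalent of the g flag in the call.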

Save that, and now you're halfway done. The second part of this process is to add that filter to the user.action file. So I will add the following to my /Users/michael/.privoxy/user.action file:

{+filter{justtesting}}
/

The first line turns on the filter named "justtesting" and the second line is a glob pattern saying which web addresses to apply the filter to. Use a forward slash if you want to apply the filter to all web addresses. But if I'd wanted to limit my filter to just the nytimes server, I could have written this instead:

{+filter{justtesting}}
*.nytimes.com

Save that, and you're done. You don't need to restart privoxy for those changes to be applied. Now, if I go to the live web site I should see the effects of my switcheroo. Can you spot my somewhat Simpsonsesque family on my view of the live New York Times Technology Page?

NY Times page with my own image inserted

Note: If you want to always run privoxy and you've used port to install it, run the following commands on Mac. The first loads the launch daemon and the second checks that it is loaded:

$ sudo launchctl load -w /Library/LaunchDaemons/org.macports.Privoxy.plist
$ sudo launchctl list | grep -i privoxy

Tags: tools.
