02 June 2010

tar, rsync and netcat

In my quest to quickly move files from my D-Link NAS to my new file server, I was looking for a quick way to transfer files (and maybe keep them in sync later on). I tried tar over ssh, but the encryption is rather CPU-intensive for a little NAS box with a small ARM processor and 64 MB of RAM. I then started looking at netcat, since it doesn't use encryption (and thus needs much less CPU). This doesn't have to be secure, since the storage network in my case is physically and logically isolated from the rest of the network. Here is what I found:

On the receiving end (do this first):

#nc -l 7000 | tar xvf -


Or on some systems (traditional netcat wants the listening port given with -p):
#nc -l -p7000 | tar xvf -

And on the sending end:
tar cvf - * | nc hostip:7000

Or on some systems (like Mac OS X or *BSD), with a space instead of the colon between host and port:
tar cvf - * | nc hostip 7000

This will send everything in the current directory on the sender to the current directory of the receiver (except hidden dotfiles at the top level, which the shell's * does not match). The port doesn't matter, just use an unused one (port 7000, I believe, is for AFS, which I don't use). This should help preserve file permissions, etc. as well.
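To see what the tar stream carries without touching the network, here is a minimal local sketch in which a plain pipe stands in for netcat. The scratch directory and file names are made up for illustration. It archives "." rather than "*" so that dotfiles come along, and it passes -p on the extracting side to ask tar to restore the recorded permissions.

```shell
#!/bin/sh
# Local stand-in for the netcat transfer: a plain pipe replaces "nc".
set -eu

work=$(mktemp -d)                      # hypothetical scratch area
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/sub" "$work/dst"

echo "visible" > "$work/src/sub/file.txt"
echo "hidden"  > "$work/src/.dotfile"  # a shell "*" would not match this
chmod 640 "$work/src/sub/file.txt"

# Sender side: archive "." (relative to src) so dotfiles are included.
# Receiver side: -p asks tar to restore the permissions stored in the archive.
tar -cf - -C "$work/src" . | tar -xpf - -C "$work/dst"

# Show what arrived, including hidden files.
ls -lA "$work/dst" "$work/dst/sub"
```

For the real transfer, the left half of the pipe would feed nc on the sender and the right half would read from nc -l on the receiver.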

Just an update: this worked. It took a while, but it did successfully copy everything (I ran spot-check md5 sums to make sure).
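The checksum comparison can be scripted rather than done by eye. The sketch below is one possible way, not necessarily the exact check I ran: it assumes GNU md5sum is available (Mac OS X ships md5 instead, with different output), and it uses a local cp as a stand-in for the transfer. The directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: build a checksum manifest on the sender, verify it on the receiver.
# Assumes GNU coreutils md5sum.
set -eu

work=$(mktemp -d)                      # hypothetical scratch area
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/sender/docs" "$work/receiver"
echo "alpha" > "$work/sender/a.txt"
echo "beta"  > "$work/sender/docs/b.txt"

# Stand-in for the tar/netcat transfer.
cp -R "$work/sender/." "$work/receiver/"

# On the sender: one checksum line per file, with relative paths.
(cd "$work/sender" && find . -type f -exec md5sum {} + > "$work/manifest.md5")

# On the receiver: re-hash every listed file and compare against the manifest.
# md5sum -c prints one "OK" or "FAILED" line per file and exits non-zero on a mismatch.
(cd "$work/receiver" && md5sum -c "$work/manifest.md5")
```

In practice the manifest file would be copied over alongside the data and the last command run on the receiving machine.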

Another clever one is to tar and gzip them before sending them off:
On the receiving end:

# nc -l 7000 > filename.tar.gz

Here you are saying that whatever comes out of netcat is the archive, so you redirect it into the file you just specified (> creates the file, or overwrites it if it already exists).

And on the sending end:

# tar cvf - * | gzip -9 | nc hostip:7000

Here you are just piping the output of tar into gzip and then sending the result off to netcat. The -9 selects the highest (and slowest) compression level; see man gzip for more details.
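Once the archive has landed, it is worth checking that the compressed stream arrived whole before relying on it. This is a minimal local sketch under the assumption that a file redirect can stand in for the netcat hop; the scratch directory and notes.txt are made-up names.

```shell
#!/bin/sh
# Sketch: create a gzip-compressed tar stream, then test and list it.
set -eu

work=$(mktemp -d)                      # hypothetical scratch area
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src"
echo "some data" > "$work/src/notes.txt"

# Sender pipeline; the redirect replaces "| nc ..." and "nc -l ... >".
tar -cf - -C "$work/src" . | gzip -9 > "$work/filename.tar.gz"

# Receiver side: gzip -t verifies the compressed data without unpacking it.
gzip -t "$work/filename.tar.gz"

# List the archive contents without extracting anything.
gzip -dc "$work/filename.tar.gz" | tar -tf -
```

A truncated transfer would typically make gzip -t fail, which is a cheap first check before extracting.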

I also verified the copy with rsync, which is another good way to keep these files in sync from now on. Sure enough, it only copied the few files that had changed (Mac's annoying .DS_Store and a couple of touch'd files).

#rsync -ave ssh user@hostip-or-fqdn:/source/dir/path /destination

One thing to note is that the source is the full path to the directory to copy from, and the destination is the directory you want the entire tree placed under. Because the source has no trailing slash, path ends up as a subdirectory of the destination (/destination/path), so leave path out of the destination argument.
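The trailing-slash behaviour is easy to try out locally before pointing rsync at real data. The sketch below uses made-up scratch directories (dest1, dest2) and local paths instead of user@hostip, and it simply skips itself if rsync is not installed.

```shell
#!/bin/sh
# Sketch: how a trailing slash on the rsync source changes the result.
set -eu
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed, skipping"; exit 0; }

work=$(mktemp -d)                      # hypothetical scratch area
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source/dir/path" "$work/dest1" "$work/dest2"
echo "data" > "$work/source/dir/path/file.txt"

# No trailing slash: the directory "path" itself is created under dest1.
rsync -a "$work/source/dir/path" "$work/dest1"

# Trailing slash: only the contents of "path" are copied into dest2.
rsync -a "$work/source/dir/path/" "$work/dest2"

ls "$work/dest1"   # path
ls "$work/dest2"   # file.txt
```

The same rule applies when the source is remote, as in the command above.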
