author Zhiming Wang <zmwangx@gmail.com> 2015-02-10 03:19:24 -0800
committer Zhiming Wang <zmwangx@gmail.com> 2015-02-10 03:19:24 -0800
commit cbc236a1ee5646dfea296a9c73ba71c16554311a
tree e38e4bd13228bdcedd5b7b23ffcedce4c45ce069
parent cf7fb969c6812b2c34a3153fc39e3cd86f03dc1a
20150210 Monitor progress of your Unix pipes with pv
Diffstat (limited to 'source/_posts')
-rw-r--r-- source/_posts/2015-02-10-monitor-progress-of-your-unix-pipes-with-pv.md 60
1 files changed, 60 insertions, 0 deletions
diff --git a/source/_posts/2015-02-10-monitor-progress-of-your-unix-pipes-with-pv.md b/source/_posts/2015-02-10-monitor-progress-of-your-unix-pipes-with-pv.md
new file mode 100644
index 00000000..766f877d
--- /dev/null
+++ b/source/_posts/2015-02-10-monitor-progress-of-your-unix-pipes-with-pv.md
@@ -0,0 +1,60 @@
+---
+layout: post
+title: "Monitor progress of your Unix pipes with pv"
+date: 2015-02-10 02:18:30 -0800
+comments: true
+categories:
+---
+Recently I found a very useful utility called `pv` (for "pipe viewer"). [Here](http://www.ivarch.com/programs/pv.shtml) is its home page, and it can be easily installed with `brew`. According to its man page,
+
+> `pv` shows the progress of data through a pipeline by giving information such as time elapsed, percentage completed (with progress bar), current throughput rate, total data transferred, and ETA.
+
+For more info, see its home page (linked above) and [man page](http://linux.die.net/man/1/pv).
+
+Why is it useful? Well, pretty obvious if you are in the right audience. For me, one particularly important use case is with `openssl sha1`. I deal with videos on a daily basis, and back up all of them to OneDrive (ever since OneDrive went unlimited). To ensure integrity of transfer (in future downloads), I append the first seven hex digits of each video's SHA-1 digest to its filename. This should be more than enough to reveal any error in transfer except for active attacks. One additional advantage is that I can now have multiple versions of the same show, event, or whatever and don't have to worry about naming conflicts (and don't have to artificially say `-ver1`, `-ver2`, etc.). This little merit turns out to be huge and saves me a lot of trouble, since naming things is intrinsically hard:
+
+> There are only three hard things concurreny, in computer science: cache invalidation, naming things, and off-by-one errors.
+
+(I learned this beefed up version of two hard things only recently.) Well, too much digression. So a SHA-1 sum is useful. (By the way, I learned in my crypto class that SHA-1 is broken as a collision-resistant hash function, though not for HMAC, which doesn't assume collision resistance, and that SHA-256 should be used instead. However, I'm not protecting against active attacks, which I couldn't do without a shared secret key anyway, so the faster SHA-1 is good for my purpose.) But at the same time, SHA-1 is slow. Maybe what's actually slow is my HDD. Whatever the bottleneck, generating a SHA-1 digest for a 10 GB video file isn't fun at all; it's even more of a torture when there's no progress bar and ETA. But hopelessly waiting has become a thing of the past with the advent (well, discovery in my case) of `pv`. Now I have nice and informative progress bars, which reduce the anxiety of waiting by an order of magnitude.
+
+For the record, here's the current version of my ruby script that attaches the first seven digits of the SHA-1 digests of the given files to their filenames:
+
+```ruby 7sha1
+#!/usr/bin/env ruby
+
+require 'fileutils'
+require 'shellwords'
+
+def rename(items)
+  num_items = items.length
+  items.each_with_index {|path, num_done|
+    printf($stderr, "%d/%d: %s\n", num_done + 1, num_items, File.basename(path))
+
+    if ! File.directory?(path)
+      extname = File.extname(path)
+      basename = File.basename(path, extname)
+      dirname = File.dirname(path)
+      # escape the path for the shell; keep only the hex digest, since
+      # some openssl versions prefix the output with "(stdin)= "
+      sha1sum = `pv #{path.shellescape} | openssl sha1`.split.last
+      new_basename = basename + "__" + sha1sum[0,7]
+      new_path = File.join(dirname, new_basename + extname)
+      FileUtils.mv(path, new_path)
+    else
+      $stderr.puts("#{path}: directory ignored")
+    end
+  }
+end
+
+rename(ARGV)
+```
+
+You might ask why I used ruby (littered with bash) when it's obviously a job for bash or perl. Well, the reason is that I first wrote this thing in ruby as a [Dropzone 3 action](https://gist.github.com/zmwangx/d6406fb8bf51ac768770). I'm lazy, so I just borrowed that script and modified its printout for shell use.
+
+---
+
+By the way, I also found a project called `cv` (Coreutils Viewer), which is [officially described as](https://github.com/Xfennec/cv)
+
+> ... a Tiny, Dirty, Linux-Only C command that looks for coreutils basic commands (cp, mv, dd, tar, gzip/gunzip, cat, etc.) currently running on your system and displays the percentage of copied data.
+
+I'll look into it when I have time, but from its description, it seems to be limited to coreutils, and OS X support might not be too awesome (at this point).