
2-cent tip: A safer 'rm'

Henry Grebler [henrygrebler at optusnet.com.au]


Sun, 20 Jun 2010 15:45:14 +1000

Hi,

[I wrote what follows nearly a month ago. Since I started working full-time-plus, I have not been able to respond as promptly as I would prefer. For many, I suspect, the subject has gone cold. Nevertheless, ...]

As often happens, I have not phrased myself well. [sigh]

I apologise for trying to respond quickly, when clearly a fuller and more considered response was indicated.

Let me try again.

As I understand it, this is the original statement of the problem:

If you've ever tried to delete Emacs backup files with

rm *~

(i.e. remove anything ending with ~), but you accidentally hit Enter before the ~ and did "rm *", ...

Silas then goes on to suggest a solution.

Perhaps I took the problem statement too literally. When I wrote cleanup.sh, it was aimed exclusively at emacs backup files. When I talked about it, I was thinking exclusively of emacs backup files. I am not aware of any other place where tildes are used.

If it's a file I've been editing with emacs, I've usually created it. So I get to choose its name.

The message I am trying to convey to TAG's broad readership is this. If you are happy with Silas's solution, that's fine. If you are happy with any of the other solutions provided, that's fine too. I myself have felt the need to deal with this problem. I offer my approach to dealing with emacs backup files for your consideration.

Part of my approach is the cleanup script; this is the technical outcome, the tangible I can offer to others.

More important is a holistic approach to the problem as originally stated. I took it to mean that certain behaviours are more risky than others. This is again in the eyes of the beholder. Some people always alias 'rm' to 'rm -i'. If they do this, they will always be prompted before deleting any file. If they specify multiple files, there will be a prompt for each file individually.

Such an approach would have saved Silas.

Again, if such an approach appeals to you, by all means adopt it.
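
In practice it is a one-line addition to your shell startup file. A minimal sketch (the prompt and file name shown are only illustrative):

# In ~/.bashrc (or your shell's equivalent): prompt before every deletion.
alias rm='rm -i'

# An accidental "rm *" now asks about each file in turn, e.g.:
#   rm: remove regular file 'notes.txt'? n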

It seems to me ill-advised to adopt a safe behaviour in one circumstance and a risky behaviour in another. The nett effect will not be as safe as your safest behaviour, but rather as risky as your riskiest behaviour. (A chain is no stronger than its weakest link.)

So, first of all, according to my philosophy, don't go creating files with dangerous characters in them. It's really not important to me to call a file 'a b' when I can just as easily call it 'a_b'. If I download a file with spaces in the filename AND I plan to edit it with emacs, it's worth renaming the file, replacing spaces with underscores (or any other innocuous character).
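
A bash one-liner along these lines does the renaming (a sketch only; the -i flag asks before overwriting anything that already exists):

# Replace every space in a filename with an underscore.
for f in *' '*; do [ -e "$f" ] || continue; mv -i -- "$f" "${f// /_}"; done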

And I am strongly recommending this behaviour of mine to others.

If you live largely in a Linux world, you should rarely encounter filenames containing dangerous characters. If you are creating files, you can choose to avoid dangerous characters. Or not. It's up to you.

Second, according to my philosophy, unusual situations deserve special handling, and, in the case of file deletion, special names. 'cleanup' bears no similarity to 'rm' - deliberately not. It did not take me long to get used to the idea that, if I wanted to get rid of all those pesky files with tildes at the end of their name, I should use the 'cleanup' command (which may, or may not, be associated with the 'rm' command - I don't need or want to think about that). YMMV. It may be easier for me than for others, since I operate in a HAL - Henry Abstraction Layer.
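
A minimal sketch of such a command (illustrative only; this is not the actual cleanup.sh) might look like:

#!/bin/sh
# cleanup (sketch) - show the emacs backup files (*~) in the current
# directory and ask once before deleting them.
ls -l -- *~ 2>/dev/null || { echo "No backup files here."; exit 0; }
printf 'Delete these files? [y/N] '
read answer
case "$answer" in
    [Yy]*) rm -- *~ ;;
    *)     echo "Nothing deleted." ;;
esac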

What does that leave? Yes, sometimes you encounter files from alien OSes with dangerous characters in their names. Then you will need to be especially cautious. For all other situations, I think 'cleanup' is ok.

Addressing the wider problem of writing resilient scripts which can handle any characters in filenames, my view is that I cannot write a script that is so rock solid that I can say, with hand on heart, that it can handle any character in a filename.

For readers who think it is easy, here's a handful of files I was able to create with just a few minutes of imagination:

-rw-rw----  1 henryg  wheel  0 May 22 15:17 *
-rw-rw----  1 henryg  wheel  0 May 22 15:17 ****
-rw-rw----  1 henryg  wheel  0 May 22 15:18 **\\**
-rw-rw----  1 henryg  wheel  0 May 22 15:17 -rf
-rw-rw----  1 henryg  wheel  0 May 22 15:21 aaa\000bbb
-rw-rw----  1 henryg  wheel  0 May 22 16:07 abc"def"ghi
-rw-rw----  1 henryg  wheel  0 May 22 16:58 abc'def"ghi
-rw-rw----  1 henryg  wheel  0 May 22 16:08 abc'def'ghi
-rw-rw----  1 henryg  wheel  0 May 22 16:08 qqq?rrr

BTW, the last file does NOT have a question mark in its name; that's just how ls displays it. It's the character ETX (hex 3).
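
For anyone who wants to reproduce these, careful quoting is all it takes; a sketch:

# Create a few of the awkward names above; quoting keeps the shell honest.
touch ./-rf '*' '****' '**\\**'
touch "$(printf 'qqq\003rrr')"    # embeds the ETX character (hex 3)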

One can handle such files individually and manually. In a script it is more difficult.

I think one can be more confident if one writes in a real programming language (rather than a script). People more familiar with perl might disagree.




Ben Okopnik [ben at linuxgazette.net]


Sun, 20 Jun 2010 23:33:24 -0400

On Sun, Jun 20, 2010 at 03:45:14PM +1000, Henry Grebler wrote:

> 
> Addressing the wider problem of writing resilient scripts which can
> handle any characters in filenames, my view is that I cannot write a
> script that is so rock solid that I can say, with hand on heart, that it
> can handle any character in a filename.

Skipping pretty much all of the preceding bits - which make a lot of sense in the specified context, with "YMMV" stamped in contrasting colors all over the outside of the box - I'm going to jump to the technical issue that Henry has raised here. It's a fun one. :)

> For readers who think it is easy, here's a handful of files I was able
> to create with just a few minutes of imagination:
> 
> -rw-rw----  1 henryg  wheel  0 May 22 15:17 *
> -rw-rw----  1 henryg  wheel  0 May 22 15:17 ****
> -rw-rw----  1 henryg  wheel  0 May 22 15:18 **\\**
> -rw-rw----  1 henryg  wheel  0 May 22 15:17 -rf
> -rw-rw----  1 henryg  wheel  0 May 22 15:21 aaa\000bbb
> -rw-rw----  1 henryg  wheel  0 May 22 16:07 abc"def"ghi
> -rw-rw----  1 henryg  wheel  0 May 22 16:58 abc'def"ghi
> -rw-rw----  1 henryg  wheel  0 May 22 16:08 abc'def'ghi
> -rw-rw----  1 henryg  wheel  0 May 22 16:08 qqq?rrr
> 
> BTW, the last file does NOT have a question mark in its name; that's
> just how ls displays it. It's the character ETX (hex 3).
> 
> One can handle such files individually and manually. In a script it is
> more difficult.

Not a problem:

-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 *
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 ****
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 **\\**
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 aaa\000bbb
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 abc'def'ghi
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 abc'def"ghi
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 abc"def"ghi
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 qqq?rrr
-rw-r--r-- 1 ben ben 0 2010-06-20 22:43 -rf

Here's the 'ls -lb' version of the same directory, just so we're on the same page:

-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 *
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 ****
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 **\\\\**
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 aaa\\000bbb
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 abc'def'ghi
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 abc'def"ghi
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 abc"def"ghi
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 qqq\003rrr
-rw-r--r-- 1 ben ben 0 2010-06-20 22:44 -rf

Solution to removing these one at a time, programmatically:

#!/bin/bash
# Created by Ben Okopnik on Sun Jun 20 22:47:55 EDT 2010
 
IFS='
'
 
for f in *; do rm "./$f"; done

I've found this approach to be quite robust. You could even do

IFS=$'\n'; for f in *; do rm "./$f"; done

if you wanted to do it as a command-line one-liner.
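
One detail worth pointing out: the "./$f" rather than a bare "$f" is what keeps a file named '-rf' from being parsed as options. For example (GNU rm behaviour assumed):

rm -rf       # parsed as options with no operand; with -f, rm exits silently
             # and the file named '-rf' survives
rm ./-rf     # the leading ./ makes it unambiguously a filename
rm -- -rf    # '--' ends option processing, so this works too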

> I think one can be more confident if one writes in a real programming
> language (rather than a script). People more familiar with perl might
> disagree.

I recall speaking to someone who had attended a seminar held by David Korn (of Korn shell fame) in which David had convinced the audience that you can indeed write robust and secure shell scripts. The fellow had quite the post-epiphany glow on him. :) For myself, I find that shell scripting provides a fairly complete set of functionality for handling literally any user-level operation that can happen in the file system: any time I think I've found an exception, a little digging usually turns up a function that handles the specific thing I was looking for. It's not quite an article of faith with me, but it's pretty close.

As for Perl, it's a great tool that makes most things easier... with the one exception being the exact functionality set I mentioned above. Almost anything to do with the file system can be done more easily and more simply in the shell. Perl shines when problems get beyond that simple level.

--
* Ben Okopnik * Editor-in-Chief, Linux Gazette * https://LinuxGazette.NET *

