Tag Archives: unix

Quickly Scrambling a File (unsort)

The Unix sort command is one of those handy tools that make Unix such a great environment for mucking about in.

A less-common need is to shuffle the lines in a file. Here’s a one-liner that I found in the comp.unix.shell archives. Its performance seems to be reasonable for casual use; for example, on the 51,000-line file I have, it took 1.3 seconds.

$ cat unsort
perl -wne 'printf "%016.0f%s", rand 2**53, $_'  | sort | cut -b17-
$

An example:

$ cat file.sorted 
01 To be, or not to be: that is the question:
02 Whether 'tis nobler in the mind to suffer
03 The slings and arrows of outrageous fortune,
04 Or to take arms against a sea of troubles,
05 And by opposing end them? To die: to sleep;
06 No more; and by a sleep to say we end
07 The heart-ache and the thousand natural shocks
08 That flesh is heir to, 'tis a consummation
09 Devoutly to be wish'd. To die, to sleep;
10 To sleep: perchance to dream: ay, there's the rub;
11 For in that sleep of death what dreams may come
12 When we have shuffled off this mortal coil,
13 Must give us pause: there's the respect
14 That makes calamity of so long life;
15 For who would bear the whips and scorns of time,
16 The oppressor's wrong, the proud man's contumely,
17 The pangs of despised love, the law's delay,
18 The insolence of office and the spurns
19 That patient merit of the unworthy takes,
20 When he himself might his quietus make
21 With a bare bodkin? who would fardels bear,
22 To grunt and sweat under a weary life,
23 But that the dread of something after death,
24 The undiscover'd country from whose bourn
25 No traveller returns, puzzles the will
26 And makes us rather bear those ills we have
27 Than fly to others that we know not of?
28 Thus conscience does make cowards of us all;
29 And thus the native hue of resolution
30 Is sicklied o'er with the pale cast of thought,
31 And enterprises of great pith and moment
32 With this regard their currents turn awry,
33 And lose the name of action. - Soft you now!
34 The fair Ophelia! Nymph, in thy orisons
35 Be all my sins remember'd.
$ ./unsort < file.sorted > file.unsorted
$ cat file.unsorted
32 With this regard their currents turn awry,
18 The insolence of office and the spurns
34 The fair Ophelia! Nymph, in thy orisons
19 That patient merit of the unworthy takes,
14 That makes calamity of so long life;
04 Or to take arms against a sea of troubles,
10 To sleep: perchance to dream: ay, there's the rub;
06 No more; and by a sleep to say we end
24 The undiscover'd country from whose bourn
16 The oppressor's wrong, the proud man's contumely,
15 For who would bear the whips and scorns of time,
09 Devoutly to be wish'd. To die, to sleep;
17 The pangs of despised love, the law's delay,
30 Is sicklied o'er with the pale cast of thought,
29 And thus the native hue of resolution
12 When we have shuffled off this mortal coil,
31 And enterprises of great pith and moment
11 For in that sleep of death what dreams may come
03 The slings and arrows of outrageous fortune,
28 Thus conscience does make cowards of us all;
02 Whether 'tis nobler in the mind to suffer
27 Than fly to others that we know not of?
13 Must give us pause: there's the respect
07 The heart-ache and the thousand natural shocks
21 With a bare bodkin? who would fardels bear,
33 And lose the name of action. - Soft you now!
20 When he himself might his quietus make
05 And by opposing end them? To die: to sleep;
35 Be all my sins remember'd.
26 And makes us rather bear those ills we have
23 But that the dread of something after death,
01 To be, or not to be: that is the question:
25 No traveller returns, puzzles the will
22 To grunt and sweat under a weary life,
08 That flesh is heir to, 'tis a consummation
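
The one-liner works in three steps: Perl prefixes every line with a zero-padded, 16-digit random number, sort orders the lines by that prefix, and cut -b17- strips the first 16 bytes again. Here is a minimal sketch of the same decorate/sort/strip idea using awk instead of Perl. The function name unsort_awk is made up for illustration, and this is not the version from the newsgroup.

```shell
# unsort_awk: hypothetical awk variant of the decorate/sort/strip shuffle.
# Reads lines on stdin and writes them to stdout in random order.
unsort_awk() {
    # 1. Decorate: awk prefixes each line with a fixed-width random key and a tab.
    #    srand() with no argument seeds from the clock, so two runs within
    #    the same second may produce the same order.
    # 2. Sort: the lines are ordered by their random key.
    # 3. Strip: cut drops the key and keeps everything after the first tab.
    awk 'BEGIN { srand() } { printf "%.17f\t%s\n", rand(), $0 }' | LC_ALL=C sort | cut -f2-
}
```

It is used the same way as the script above, e.g. unsort_awk < file.sorted > file.unsorted. On systems with GNU coreutils, the shuf command does the same job directly.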

Running Scripts in PowerShell

These are the steps that I had to take in order to create and run scripts in PowerShell. First things first… privileges.

No Scripts Allowed

By default, PowerShell will not run scripts, including ones that you create yourself. If you attempt to do so, you will probably receive the following error message:

File C:\Users\foo\bar.ps1 cannot be loaded because the execution of scripts is disabled on this system. Please see “get-help about_signing” for more details.

Running the Get-ExecutionPolicy cmdlet shows the current policy, which is Restricted by default. This means a blanket prohibition on scripts. The possible execution policy values are:

Policy        Meaning
Restricted    Scripts are prohibited.
Default       Normally corresponds to Restricted.
AllSigned     Only scripts with valid digital signatures may be executed.
RemoteSigned  Local scripts may be run. Scripts from the Internet or other “public” places must bear valid digital signatures.
Unrestricted  Any and all scripts can be executed.

Setting the Required Privileges

For typical software engineer and power user work, the RemoteSigned setting is appropriate. This will allow you to create, modify, and execute your own scripts while retaining the Internet barrier.

To change the execution policy:

  1. Run PowerShell with administrator privileges. The execution policy cannot be altered as a normal user.
  2. Run the command Set-ExecutionPolicy RemoteSigned

PowerShell will now allow you to run local scripts that you create.

File Extension

DOS batch files have the extension bat. The default file extension for PowerShell scripts is ps1.

Execution

PowerShell looks like Unix here. Unless the script’s directory is listed in the Path environment variable, you will need to mimic the Unix dot-slash notation (using the backslash, of course) to run a script in your current directory.

For example, given a script called bar.ps1 in your current directory, if you attempt to run the script by name alone, you’ll get an error as follows.

PS C:\Users\foo> bar.ps1
The term 'bar.ps1' is not recognized as a cmdlet, function, operable program,
or script file. Verify the term and try again.
At line:1 char:5
+ bar.ps1 <<<< 

Instead you'll have to put .\ in front of the script.

PS C:\Users\foo> .\bar.ps1
Hello, World!

If you do store the script in the directory search path, you will only need to type the file name (without the ps1 extension). This behaviour is similar to batch files.

PS C:\Users\foo> bar
Hello, World!

In-Process Execution

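Like the Unix shell, PowerShell isolates a script from the session that calls it, though it does so by running the script in its own child scope rather than by spawning a new process. If, for example, the script defines a variable or a function, that definition will be lost when the script terminates. (Environment variables are process-wide, so a change made through $env: does persist.)

To execute the script in the current scope, use the dot-space notation, known as dot sourcing: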

PS C:\Users\foo> . .\bar.ps1
Hello, World!
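
For comparison, the Unix shell behaviour that the dot-space notation mirrors can be sketched in POSIX sh. In the Unix shell a script normally runs in a child process, so its variable assignments vanish unless the script is sourced. The temporary script and the GREETING variable are made up for illustration, and mktemp is assumed to be available.

```shell
# Hypothetical demo script that sets a single shell variable.
demo_script=$(mktemp)
printf 'GREETING="set by script"\n' > "$demo_script"

GREETING="original"

# Run as a child process: the assignment happens in the child and is lost.
sh "$demo_script"
after_child=$GREETING

# Source it with dot-space: the assignment happens in the current shell.
. "$demo_script"
after_source=$GREETING

echo "after child run: $after_child"    # prints "after child run: original"
echo "after sourcing: $after_source"    # prints "after sourcing: set by script"

rm -f "$demo_script"
```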

Venting On PowerShell

The bulk of my site consists of personal notes that I make public; it’s mostly informational with a few random bits thrown in. Here I’m breaking that mold and venturing into commentary, since I’ve just spent the last couple of hours banging my head against the wall.

As arcane and dated as the Unix-style shell may be, over the decades it has proven itself to be a powerful tool for system administrators, developers, and power users who interact with Unix-like systems. It’s a capable tool even now in the 21st century. The stark lack of an equivalent tool in the Windows operating systems has long been a source of heartburn when trying to accomplish similar tasks in an efficient manner. Over the years there have been various attempts to bring Unix-style tools to Windows (such as Cygwin), with varying degrees of success. Nothing really took hold, however.

PowerShell is an attempt to address this gaping hole in the Windows world. For quite some time I didn’t really “get” what problem space PowerShell tries to address. This is because the documentation and web sites surrounding PowerShell never really come out and say, “This is our attempt to create a Unix-style shell based on the core concepts that make the Unix-style shell so successful.” They generally talk about how it’s based on .NET objects, and how everything is a native .NET object, blah blah blah — the focus being on the internals and not the problem space.

Again I ran into a situation where not having a Unix-style shell causes a lot of pain, so I sat down and started to plow through the e-book Mastering PowerShell from the PowerShell.com web site. The first pass through the initial chapters was skimming in order to divine what PowerShell is. I went through the Unix-like examples, and it became clear quickly that PowerShell’s aim is to be a native Windows implementation of the core Unix shell concepts.

The core concept in the Unix shells is that one should be able to arbitrarily chain small, single-purpose utilities together to get the exact data that one wants. If there’s something that doesn’t do what you want, then you write a small tool that takes data off of stdin, processes it, and writes the result to stdout. Because C is not the most friendly of languages for these types of tasks, “Swiss army knife” languages like Perl evolved. Over time some tools like grep grew less single-purpose, as common tasks that took a lot of typing were compressed into shorter commands. For the shell user, economy of keystrokes reigns over theoretical purity. Nevertheless, that core paradigm never changed despite the inconsistencies and crazy contradictions that one runs into.
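
As a minimal sketch of that idea, here is a tiny filter chained with standard tools. The upcase function and the sample input are made up for illustration.

```shell
# upcase: hypothetical single-purpose filter that reads stdin and writes stdout.
upcase() {
    tr '[:lower:]' '[:upper:]'
}

# Chain it with standard tools: keep the lines that mention "get"
# (ignoring case), upper-case them, and sort the result.
printf 'get-help\nset-item\nGet-Item\n' | grep -i get | upcase | sort
# prints:
# GET-HELP
# GET-ITEM
```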

The designers of PowerShell followed this model, that of streaming data from one small utility to the next. The primary difference, which is what web sites and documentation get hung up on, is that whereas the Unix-style shell streams bytes, PowerShell streams .NET object instances. (Unfortunately the designers also brought along inconsistencies and unintuitive behaviour, but that’s a discussion for another day.)

Fortunately for those of us who are comfortable with Unix-style shells, much of the core syntax was adopted wholesale. For example, one pipes with the vertical bar (“|”) and redirects to files with “>”. There are a number of pre-defined aliases that make the initial brush with PowerShell superficially Unix-like. For example, one will find aliases for ls, cd, rm, history, sort, etc.

However, when you try to move beyond that thin veneer of compatibility and perform simple and common tasks, your life will become hellish as you unceremoniously plow into that devil that resides in the details. As an example, I was trying to replicate the functionality of the instinctive command

help | grep -i get

The answer that I was finally able to track down is the obvious and intuitive

(help | out-string -stream) | select-string get

(For those with broken sarcasm meters, that was full-on sarcasm.) When searching for an answer, the thing that disappointed me was the common attitude that the “PowerShell paradigm” is more pure and thus far superior to the outmoded Unix-style format. This is, of course, ridiculous because it directly violates economy of motion (keystrokes), a fancy way to say that it can be bloody cumbersome. I heartily recommend a review of Larry Wall’s humorous but insightful virtues of programmers: Laziness, Hubris, and Impatience. I’m tempted to tear into this attitude, but the software world is no stranger to sacrificing usability and intuitiveness on the altar of theoretical purity and overengineering.

Note that this is not a blanket condemnation of PowerShell. I wanted to do a handful of simple commands, and it took me a long time to piece together a series of comparatively convoluted statements. In other words, in those instances PowerShell made the simple difficult. On the other hand, there are some things (like the [xml] data type) that make the difficult drop dead simple. Those things exhibit great elegance.

The long and short of it, from my perspective, is that PowerShell will not be simple for Un*x users to pick up. Expect to bang your head against the wall for some time until you understand PowerShell’s quirks.