June 2017 – Dave Rix

When writing bash scripts, I often connect a series of commands together into a pipeline, where the commands are separated by the pipe | character.

In a pipeline, the standard output of the first command becomes the standard input of the second command, the output of the second becomes the input of the third, and so on.
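As a minimal sketch of that hand-off (the input words here are made up purely for illustration), printf writes three lines, sort receives and orders them, and head receives the sorted lines and keeps only the first:

```shell
# Illustrative input only: three words, one per line.
# Stage 1 (printf) writes the words, stage 2 (sort) orders them,
# stage 3 (head -n 1) keeps just the first line it receives.
printf 'pear\napple\nfig\n' | sort | head -n 1
# prints: apple
```

Each command only ever sees what the previous one wrote, so stages can be added or removed one at a time while building up a pipeline.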

This can be useful in a number of situations, such as when you need to process the output of a command further before displaying it or assigning it to a variable.
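For the variable case, one common approach is to wrap the whole pipeline in command substitution. This is only a sketch: the input values and the variable name largest are hypothetical.

```shell
# Hypothetical input: five numbers, one per line.
# $( ... ) runs the pipeline and captures what the last command prints.
largest=$(printf '5\n3\n12\n3\n9\n' | sort -rn | head -n 1)

# The captured value can then be used like any other variable.
echo "Largest value: $largest"
```

Only the output of the final command in the pipeline ends up in the variable. The intermediate output is consumed by the next stage and never displayed.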

For example, given a file containing a sequence of numbers, one per line:

$ cat numbers.txt
2250
2262
1
1
1
15379
15379
1
16112
16121

We can find the numbers that occur most frequently in the file as follows:

$ sort -n numbers.txt | \
        uniq -c | \
        sort -rn | \
        head
141 2
 69 1685
 59 1
 53 2950
 11 1902
  4 2870
  4 2132
  3 9151
  3 4345
  3 1796

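Here we first sort the contents of the file, using -n to sort them numerically. The sort has to come first because uniq only collapses duplicate lines that are adjacent to each other. We then pipe that output into the uniq command with the -c option, which prefixes each unique value with the number of times it occurs. Next we sort again, this time with -rn for reverse numeric order, so that the highest counts come first. Finally, we take the first 10 entries in the output (10 is the default number of lines that head will return).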
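To see why the first sort matters, here is a small sketch that uses only the ten sample lines listed earlier, written to a temporary file. The variable names and the awk clean-up steps are my own additions for illustration, not part of the original pipeline.

```shell
# Write the ten sample lines to a temporary file.
tmpfile=$(mktemp)
printf '%s\n' 2250 2262 1 1 1 15379 15379 1 16112 16121 > "$tmpfile"

# Without sorting first, uniq -c only merges adjacent duplicates, so the
# value 1 shows up as two separate runs (one of 3 and one of 1).
# awk counts how many output lines refer to the value 1.
runs_of_one=$(uniq -c "$tmpfile" | awk '$2 == 1 {n++} END {print n}')
echo "Runs of 1 without sorting: $runs_of_one"

# With the sort, all four 1s sit next to each other and are counted together.
# The final awk only strips the padding that uniq -c puts before the count.
most_common=$(sort -n "$tmpfile" | uniq -c | sort -rn | head -n 1 | awk '{print $1, $2}')
echo "Most common (count value): $most_common"

rm -f "$tmpfile"
```

On this sample the unsorted run reports the value 1 twice, while the sorted pipeline reports it once with a count of 4.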

After spending over 25 years in the industry as a full-stack developer, Linux admin, MySQL DBA, Data Architect and Infrastructure Architect, I thought it was about time I started to write down some of my findings and experience from over the years.

My current specialist areas are MySQL and Amazon Web Services (AWS).

In the MySQL space, I work with all areas from installation and upgrade to performance tuning and High Availability configurations, in particular backup and restore, automatic fail-over and disaster recovery.

I cover most areas of AWS, with a particular focus on automated builds of the full infrastructure stack using Terraform and Ansible, from CloudFront through Elastic Load Balancing to EC2 instances, Redshift, DynamoDB, S3 and RDS databases, focusing here on MySQL and Aurora.

I intend to write a number of articles on the areas I have worked on, highlighting some of the challenges and how to overcome them. Hopefully, this will be of some use to anyone who stumbles across this site in the future!
