Getting Rid of Shell Scripts Once and For All

Oct 20, 2025
@maxgallup

Shell scripts are among the most ubiquitous forms of programming: they sit deeply nested in system configuration procedures and build scripts, and are even invoked from user-space programs. They serve as the glue that binds together the cornerstone infrastructure we rely on. Unfortunately, due to their dynamic nature, shell scripts fundamentally don't make it easy to write reliable systems with confidence. Because they depend on external factors, shell scripts can easily break when moved to another machine or when a silent assumption is not upheld by the system administrator. This article explores a type system that moves unnecessary runtime errors to compile time.

The Problem

The shell is essentially a minimal language whose core capability boils down to flexibly redirecting the inputs and outputs of programs found on the system. It was designed to give programs maximum flexibility and introduced the concept of composability by piping the output of one program into the input of another. When it was first introduced, it massively increased programmer productivity, because suddenly other programs could be used more like functions, albeit without standardized input/output type signatures. At the same time, the shell is intrinsically tied to external factors that influence whether a given piece of shell code can execute as intended. For example, the script below will only execute correctly if 1) the program cat is installed on the current system, 2) cat resides in one of the directories listed in the $PATH environment variable, and 3) the file to be printed resides in the directory from which the script is invoked. One can easily see how executing a single line of shell code quickly becomes a chain of assumptions external to the shell code itself.

#!/bin/bash
cat file.txt

It's important to note the distinction between correctness of program setup and correctness of program execution. The latter is subject to bugs introduced by the program itself and is entirely out of scope, while the former is the core concern of this article. However, when a program externalizes an assumption about its input parameters, that assumption becomes a problem to be solved at "setup time", which makes it the shell's problem. Coming back to our example, the cat program expects its first argument to be a string representing the relative path to a file on disk. Thus, it is the shell's responsibility to ensure that the file is truly there before the program is executed. The common thread among all of these concerns is that they are only handled at runtime, forcing the programmer to invest time and effort handling them then or, in the common case, to debug them in production.

Sketches of a Proposal

The goal of this proposal is to lay the foundations for a new language that moves the previously discussed class of errors to compile time. To do so, the compiler needs to capture all external factors that can influence a successful program setup as invariants, so that it can reason about them statically. The main goal of such a language would be to turn existing CLI tools into function calls with strictly typed input arguments and return types. If done right, this would create a sound program preparation framework with the following property:

Key
Well-typed programs guarantee correct program preparation. In other words: if it compiles, it can run.

To achieve this, the compiler must cover all external factors that influence program preparation and address them at compile time. Broadly speaking, there are two main categories of external factors that the compiler needs to take into account:

  1. Compile-time guarantees about the presence of any program. In order to reason about a program, the compiler must first guarantee that it exists. This requires standardizing a software bill of materials and involves declarative package management and automatic dependency resolution.
  2. Type-safe input/output contracts for CLIs. The compiler must be informed about a program's input and output requirements in the form of a unified contract. Akin to an API specification, such a contract provides a type-safe layer that informs the type system about a program's input requirements and expected output types. The contract's type system must be expressive enough to let developers declare the semantics of their program's intent via the types (more on that later). Finally, the contract also informs the compiler about a declarative key-value store of environment variables.

Such a language could bring to shell scripting the modern developer experience that is already commonplace throughout most language ecosystems.

Solution Part 1: Declarative Software Packaging

The first pain point developers face when writing shell scripts is that there is no guarantee that a given software package is available in the context of the script. To provide the compiler with the necessary information, the developer must declare a "Software Bill of Materials" (SBOM) that lists uniquely identifiable executables. This proposal makes no assumptions about the underlying program or how it was built; it exists at a layer of abstraction above existing package management ecosystems. Thus, the SBOM format must be registry-agnostic and must identify each executable cryptographically.

For example, the following entry would allow the compiler to verify that the source repo for casey's just program at version 1.43.0 has a hash of 5891b5. It also claims that the binary is available for download from the GitHub release registry, but the implementation of the registry does not impact the format of the specification, i.e. the specification is not registry-dependent.

[sbom]
source_hash = "5891b5b522d5df086d0ff0b1"
source_repo = "https://github.com/casey/just"
registry = "github"
owner = "casey"
name = "just"
version = "1.43.0"
hash = "e258d248fda94c63753607f7"

What's important here is the cryptographic link between the binary and the source repository it originated from. If the SBOM snippet above checks out at compile time, the compiler can reason about the presence of that binary. Once such an SBOM is declared, the developer can start writing scripts knowing that the programs are available. The language itself could even have a package manager, built over existing ones, that co-locates the SBOM declaration with source code.

use nix/packages/gnugrep/3.11.0 as grep

grep()...
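
As a rough illustration of what the compile-time check implied by the SBOM could look like under the hood, the toolchain might fetch the pinned artifact and compare its digest against the declared value. The sketch below is plain Rust, assuming the sha2 and hex crates; the function name and the choice of SHA-256 are illustrative, not part of the proposal.

use sha2::{Digest, Sha256};
use std::fs;

/// Compare the digest of a fetched artifact against the value pinned in the SBOM.
/// Hypothetical helper: the real toolchain would also resolve and download the
/// artifact from the declared registry before verifying it.
fn verify_artifact(path: &str, pinned_hash: &str) -> Result<(), String> {
    let bytes = fs::read(path).map_err(|e| e.to_string())?;
    let digest = hex::encode(Sha256::digest(&bytes));
    // The SBOM may pin a truncated prefix of the full digest.
    if digest.starts_with(pinned_hash) {
        Ok(())
    } else {
        Err(format!("hash mismatch: expected {pinned_hash}, got {digest}"))
    }
}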

But wait, what about Nix?

Indeed, Nix tackles this exact problem and could be used to provide static guarantees about the presence of binaries. However, the proposed system is intended to exist at a higher layer of abstraction, where Nix fits in more as a "registry provider". The underlying mechanism should not rely on the implementation of any one provider, but rather use cryptographic information to identify a program; installing it then becomes an implementation detail. That being said, supporting Nix as a registry would be a good place to start an implementation.

Only some registries can provide a truly self-contained program. For example, Nix handles the dependency resolution of Nix packages by design, and Docker bundles all dependencies into an image. Both approaches could be built on top of. However, some package ecosystems don't guarantee that all dependencies are installed. For example, programming-language package managers (think pip, npm, cargo, etc.) allow packages to have system dependencies in the form of shared libraries, making it the user's responsibility to install them. Those package ecosystems fall out of scope, simply because they can't guarantee that a program can run after being installed.

Solution Part 2: OpenCLI

As mentioned before, in order to work with installed programs more like functions, with type-safe parameter declarations and return types, there must be a contract that declares each program's input and output types as well as any environment variables. For some languages, it might be possible to generate such a contract automatically through static analysis of the source code. Additionally, language-specific libraries could be built to centralize and ease the experience of writing command-line parsers, similar to how OpenAPI offers a wide range of language generators and SDKs.

Warning
Ideally, contracts are defined such that there is a single source of truth: either the contract itself, which the program then implements via language generators, or a contract derived automatically from existing program code.
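
To make the "derived from existing program code" path concrete, a language-specific helper could walk an existing parser definition and emit a contract skeleton. The sketch below uses Rust's clap crate purely as an example; the emitted fields loosely mirror the hypothetical JSON shown below and would still need to cover types, outputs, and environment variables.

use clap::Command;

/// Emit a minimal contract skeleton from a clap command definition.
/// Illustrative only: argument types, outputs, and env declarations are not derived here.
fn contract_skeleton(cmd: &Command) -> serde_json::Value {
    let args: Vec<serde_json::Value> = cmd
        .get_arguments()
        .map(|arg| {
            serde_json::json!({
                "name": arg.get_id().as_str(),
                "description": arg.get_help().map(|h| h.to_string()),
                "required": arg.is_required_set(),
            })
        })
        .collect();
    serde_json::json!({ "opencli": "0.0.1", "arguments": args })
}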

The existing efforts of opencli.org and openclispec.org could be expanded to fit the contract requirements of the type system. The types exposed through the contracts would need to encapsulate enough of the semantic meaning of a program's input and output to make use of the type safety introduced by the compiler. For example, one could imagine a naive contract for a simplified version of the cat program, which only takes a single input parameter:

{
  "opencli": "0.0.1",
  "arguments": [
    {
      "input": {
        "name": "file path",
        "description": "Path on disk of the file to display to stdout.",
        "type": {
          "string": "utf-8"
        },
        "required": true
      },
      "output": {
        "name": "Output stream of file.",
        "type": {
          "stream": "stdout",
          "data": "bytes"
        }
      }
    }
  ],
  "env": {
    "KEY": "VALUE"
  },
  "..."
}

The example takes inspiration from the existing OpenCLI specification but is kept short for brevity.
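
To give a feel for how a compiler front end might consume a contract of this shape, the sketch below deserializes it into Rust types with serde. The field names simply mirror the hypothetical JSON above; they are not part of any published OpenCLI schema.

use serde::Deserialize;
use std::collections::HashMap;

#[derive(Debug, Deserialize)]
struct Contract {
    opencli: String,
    arguments: Vec<Argument>,
    #[serde(default)]
    env: HashMap<String, String>,
}

#[derive(Debug, Deserialize)]
struct Argument {
    input: Input,
    output: Output,
}

#[derive(Debug, Deserialize)]
struct Input {
    name: String,
    description: Option<String>,
    // e.g. {"string": "utf-8"} or {"file": {"kind": "utf-8"}}
    #[serde(rename = "type")]
    ty: serde_json::Value,
    required: bool,
}

#[derive(Debug, Deserialize)]
struct Output {
    name: String,
    // e.g. {"stream": "stdout", "data": "bytes"}
    #[serde(rename = "type")]
    ty: serde_json::Value,
}

fn load_contract(json: &str) -> Result<Contract, serde_json::Error> {
    serde_json::from_str(json)
}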

The arguments define a pair of input/output types, which allows the compiler to know that when a string is passed as input, the expected result is a stream of output bytes at stdout. Now we could imagine invoking cat as follows in the proposed language:

use nix/packages/uutils-coreutils/0.2.2 as cat;

// Invokes the cat command and attempts to convert the utf_8 stream
// into a concrete String type
let file_text: String = cat("file.txt").stream_utf_8();

print(file_text);
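
For intuition, a generated binding like cat(...).stream_utf_8() could lower to something close to the plain Rust below: spawn the program, capture stdout, and convert it to UTF-8. The helper name is invented; only std::process::Command is real.

use std::process::Command;

/// Rough equivalent of what a generated `cat(path).stream_utf_8()` call might do.
fn cat_stream_utf_8(path: &str) -> Result<String, Box<dyn std::error::Error>> {
    let output = Command::new("cat").arg(path).output()?;
    if !output.status.success() {
        return Err(format!("cat exited with {}", output.status).into());
    }
    // Convert the captured stdout bytes into a concrete String.
    Ok(String::from_utf8(output.stdout)?)
}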

As long as the contract faithfully reflects the program's behavior, the proposed-language snippet above already provides much more confidence about the input and return types. However, since the contract defines the file input as a plain string, we can easily fall into the same trap by providing an incorrect filename. Ideally, we want the compiler to be aware of the file's presence, which means we can expand the contract to use a native File type as input instead.

  // ...
      "input": {
        "name": "file path",
        "description": "Path on disk file to display to stdout.",
        "type": {
          "file": {
            "kind": "utf-8",
          }
        },
        "required": true,
      },

Now we are forced to handle the case where the file doesn't exist, because the compiler requires a native File type. We can provide default contents to use if the file is not present.

use nix/packages/uutils-coreutils/0.2.2 as cat;

// Declares a file to be present at "./file.txt"
// If not present at runtime, it gets created with default content
let file: File = File {
    path: project_root!() / "file.txt",
    default_with: "Hello from inside the file!",
};

let file_text: String = cat(file).stream_utf_8();

print(file_text);
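
What a declaration like this might compile down to at runtime can be sketched in plain Rust: ensure the file exists, creating it with default contents when missing. The helper name is invented for illustration.

use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Ensure a file exists, creating it with default contents when missing.
/// Rough runtime equivalent of the `File { path, default_with }` declaration above.
fn ensure_file_with_default(path: &Path, default_contents: &str) -> io::Result<PathBuf> {
    if !path.exists() {
        fs::write(path, default_contents)?;
    }
    Ok(path.to_path_buf())
}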

The true power of the type system is unlocked when the contracts represent the intended meaning of program inputs and outputs, i.e. representing files as File types instead of just strings.

An Involved Example

The following is a more involved example where the file output of one program becomes the input to another. The syntax is nowhere near set in stone and borrows mostly from Rust to highlight the provided type safety. It also assumes that pandoc and curl have OpenCLI contracts.

use nix/packages/pandoc/3.6.0 as pandoc;
use docker/alpine/curl/8.14.1 as curl;

// Declares that src directory must exist and intentionally propagates
// the error up if it doesn't via the ? operator
let src_dir: File = File {
    path: project_root!() / "src",
    empty: false
}?;

// Generate PDF - pandoc returns a File type representing the output
let pdf: File = pandoc(src_dir)
    .format("pdf")
    .output(project_root!() / "output.pdf")?;

// Read PDF as bytes and encode to base64
let pdf_bytes: Bytes = pdf.as_bytes();
let pdf_base64: String = pdf_bytes.encode_base64();

// Upload via POST request - curl returns a typed curl::Response
let upload_result: curl::Response = curl()
    .method(curl::HttpMethod::Post)
    .url("https://example.com/upload")
    .header("Content-Type", "application/json")
    .body({
        "file": pdf_base64,
        "filename": "output.pdf"
    })
    .send()?;

match upload_result.status_code() {
    200..=299 => info!("Upload successful!"),
    code => info!("Upload failed with status: {code}"),
}

Explanation: the script pins pandoc and curl through their SBOM entries, asserts that the src directory exists (propagating the error with the ? operator if it doesn't), renders it to a PDF, base64-encodes the resulting bytes, and uploads them with a typed HTTP request whose status code is then matched on.

However, one important question remains: how does the compiler know the type interface of nix/packages/pandoc/3.6.0? One solution would be to embed the type contract into the program itself, which forces the contract and its implementation to live as close to one another as possible. Then, as part of the specification, there could be a convention reserving the opencli subcommand of every CLI to simply return the type contract to the compiler.
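
Under that convention, the compiler could obtain a program's contract by invoking the reserved subcommand and parsing whatever JSON it prints. A minimal sketch, assuming serde_json; the opencli subcommand is part of this proposal, not a feature of any existing tool.

use std::process::Command;

/// Ask `<tool> opencli` for its contract and parse the JSON it prints.
fn fetch_contract(tool: &str) -> Result<serde_json::Value, Box<dyn std::error::Error>> {
    let output = Command::new(tool).arg("opencli").output()?;
    if !output.status.success() {
        return Err(format!("`{tool}` does not expose an opencli contract").into());
    }
    Ok(serde_json::from_slice(&output.stdout)?)
}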

Conclusion

The shell has served us remarkably well for decades, but its fundamental design assumptions—dynamic execution, reliance on implicit environmental state, and lack of type safety—have become increasingly incompatible with modern software engineering practices. While shell scripts remain ubiquitous in critical infrastructure, their brittleness and opacity create a persistent source of preventable failures that manifest only at runtime, often in production environments.

The language proposed in this article represents a paradigm shift: treating command-line tools as first-class, type-safe functions with explicit contracts rather than opaque black boxes invoked with string arguments. By moving program availability checks, file system interactions, and input/output validation to compile time, we can achieve the core promise that well-typed programs guarantee correct program preparation. This single property eliminates an entire class of errors that have plagued shell scripting since its inception.

The combination of declarative software bills of materials and OpenCLI-style contracts creates a foundation where developers can reason about their automation scripts with the same confidence they have in typed application code. IDE autocompletion, refactoring tools, and static analysis—luxuries taken for granted in modern programming—suddenly become available for the infrastructure glue that holds our systems together.

Of course, realizing this vision presents significant challenges. Establishing comprehensive OpenCLI contracts for the vast ecosystem of existing command-line tools would require substantial community coordination. The abstraction layer over diverse package management ecosystems must be carefully designed to avoid becoming either too restrictive or too leaky. And there remains the fundamental question of adoption: convincing developers to learn a new language and toolchain when bash scripts, however flawed, are familiar and "good enough."

Yet the potential impact justifies the effort. Countless developer hours are lost each year debugging shell scripts that fail due to missing dependencies, incorrect file paths, or malformed command arguments—errors that a sufficiently expressive type system could catch immediately. As infrastructure-as-code practices continue to proliferate and systems grow increasingly complex, the need for more robust automation tooling becomes ever more pressing.

The path forward requires not just technical innovation, but ecosystem building: creating comprehensive package registries, establishing contract standards, developing compiler infrastructure, and fostering a community willing to rethink how we approach the unglamorous but essential work of system automation. The shell revolutionized computing by making programs composable. Perhaps it's time to make that composition safe, reliable, and verifiable by design.