This allows me to rapidly iterate on shell pipelines. The main goal is to minimize my development latency, but it also has positive effects on dependencies (avoiding redundant RPC calls). The classic way of doing this is storing something in temporary files:
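For example (the same temp-file pattern shown in the script's own header, further down):

    $ <some long running command> > tmpfile
    $ cat tmpfile | ...
    $ rm tmpfile

But I find this awkward, and it makes it harder than necessary to experiment with the expensive command itself.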
up(1) looks really cool, I think I'll add it to my toolbox.
It looks like up(1) and memo(1) have similar use cases (or goals). I'll give it a try to see if I can appreciate its ergonomics. I suspect memo(1) will remain my mainstay:
1. After executing a pipeline, I like to press the up arrow (heh) and edit. Surprisingly often I need to edit something that's *not* the last part, but somewhere in the middle. I find this cumbersome in default line editing mode, so I will often drop into my editor (^X^E) to edit the command.
2. up(1) seems to write out a shell script file on completion. Avoiding the creation of extra files was one of my goals for memo(1). I'm sure some smart zsh/bash integration could be made that just puts the completed command back on the command line instead (a sketch follows below).
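Something like this hypothetical zsh glue, untested; the -o flag is an assumption on my part (check `up --help` for how to direct its output script), while `print -z` is a real zsh builtin that pushes text into the line editor's buffer for the next prompt:

    up-z() {
      local f
      f=$(mktemp) || return
      up -o "$f"                             # -o is an assumed flag
      [ -s "$f" ] && print -z -- "$(<"$f")"  # may need to strip a shebang line
      rm -f "$f"
    }
    # usage: <long running command> | up-z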
Another thing I built into memo(1) which I forgot to mention: automatic compression. memo(1) will use available (de)compressors (in order of preference: zstd, lz4, xz, gzip) to (de)compress stored contents. It's surprising how much disk space and IOPS can be saved this way due to redundancy.
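The selection boils down to something like this sketch (the real script may differ in the details):

    # Pick the first available (de)compressor, in order of preference.
    compress=
    for c in zstd lz4 xz gzip; do
      if command -v "$c" >/dev/null 2>&1; then
        compress="$c"
        break
      fi
    done
    : "${compress:=cat}"  # none found: store uncompressed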
I currently only have two memoized commands:
$ for f in /tmp/memo/aktau/*; do
    ls -lh "$f" =(zstd -d < "$f")
  done
-rw-r----- 1 aktau aktau 33K /tmp/memo/aktau/0742a9d8a34c37c0b5659f7a876833b6dad9ec689f8f5c6065d05f8a27d993c7bbcbfdc3a7337c3dba17886d6f6002e95a434e4629.zst
-rw------- 1 aktau aktau 335K /tmp/zshSQRwR9
-rw-r----- 1 aktau aktau 827 /tmp/memo/aktau/8373b3af893222f928447acd410779182882087c6f4e7a19605f5308174f523f8b3feecbc14e1295447f45b49d3f06da5da7e8d7a6.zst
-rw------- 1 aktau aktau 7.4K /tmp/zshlpMMdo
In general, I wonder if we're at the point where an LLM watching you interact with your computer for twenty minutes could improve your workflow, suggest tools, etc. I imagine so: when I think to ask one how to do something, I often get a very useful answer, and as a result I've automated/fixed far more things than I used to.
#!/usr/bin/env bash
#
# memo(1), memoizes the output of your command-line, so you can do:
#
# $ memo <some long running command> | ...
#
# Instead of
#
# $ <some long running command> > tmpfile
# $ cat tmpfile | ...
# $ rm tmpfile
# You can even use it in the middle of a pipe if you know that the input is not
# extremely long. Just supply the -s switch:
#
# $ cat sitelist | memo -s parallel curl | grep "server:"

To save output, sed can be used in the pipeline instead of tee. For example:

    x=$(mktemp -u)
    test -p "$x" || mkfifo "$x"
    zstd -19 < "$x" > tmpfile.zst &
    <long running command> | sed "w$x" | <rest of pipeline>

grep can be replaced with sed, with the search results sent to stderr:

    < sitelist curl ... | sed '/server:/w/dev/stderr' | zstd -19 > tmpfile.zst

Or send the search results to stderr and to some other file; sed can write to multiple files at a time:

    < sitelist curl ... | sed -e '/server:/w/dev/stderr' -e '/server:/wresults.txt' | zstd -19 > tmpfile.zst
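(For comparison, the tee equivalent of the first example above, using bash/zsh process substitution instead of an explicit fifo:)

    <long running command> | tee >(zstd -19 > tmpfile.zst) | <rest of pipeline>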
If you provide a sample showing (a) the input text format and (b) the desired output text format, then perhaps I can give an example of how to do the text processing.
I've used the Warp terminal for a couple of years, and recently they embedded AI into it. At first I was irritated and disabled it, but the AI agent is built in as an optional mode (Cmd-I to toggle). I found myself using it more and more often for commands that I have no capacity or will to remember or to dig through the man pages for (from "figure out my IP address on the wifi interface" to "make ffmpeg do this or that"). It's fast and can iterate on its own errors, and now I can't resist using it regularly. It removes the need for "tools to memorize commands" entirely.
I've been using bkt (https://github.com/dimo414/bkt) for subprocess caching. It has some nice features, like providing a TTL for cache expiration. In-pipeline memoization looks nice; I'm not sure it supports that.
I was not aware of bkt. Thanks for the link. It seems very similar to memo, and has more features:
- Explicit TTL
- Ability to include the working directory and other context in the cache key (example below).
There do appear to be downsides (from my PoV) as well:
- It's a rust program, so it needs to be compiled (memo is a bash/zsh script and runs as-is).
- There's no mention of transparent compression, either in the README or from a quick search of the source. I did find https://github.com/dimo414/bkt/issues/62, which mentions swappable backends. The fact that it uses some kind of database instead of just the filesystem is not a positive for me; I prefer state that's easy to introspect with common tools. I will often memo commands that output gigabytes of highly compressible data, and transparent compression takes care of that. One could argue this could be handled by a filesystem-level feature like ZFS transparent compression, but I don't know how to detect that in a cross-filesystem fashion.
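For reference, a bkt invocation using those features might look like this (flags per my reading of the README; verify with `bkt --help`):

    # Cache for 10 minutes, and key the cache on the working directory too.
    $ bkt --ttl=10m --cwd -- some-expensive-command --some-flag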
The default storage location for memo(1) output is /tmp/memo/${USER}. Most distributions clean /tmp up periodically, wipe it on restart, or both.
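So clearing the whole cache by hand is just a matter of removing that directory:

    $ rm -rf "/tmp/memo/${USER}"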
Separately from that:
- The invocation contains *memo* right in there, so you (the user) know that it might memoize.
- One uses memo(1) for commands that are generally slow. If you rerun a command with a slow part and it returns in a millisecond when you weren't expecting that, it should make your spider-sense tingle.
In practice, this has never been a problem for me, and I've used this hacked-together command for years.
I see no way to name the memos in your examples, so how do you refer to them later?
Also, this seems a lot like an automated way to write shell scripts that you can pipe to and from. So why not use a shell script, which won't surprise anyone, instead of this, which might?
In this invocation, a hash (sha512) is taken of "my-complex-command --some-flag my-positional-arg-1", and the output is then stored in /tmp/memo/${USER}/{sha512hash}.zst (if you've got zstd installed; other compression extensions otherwise).
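In shell terms, the lookup key amounts to something like this sketch (the real script may include more or less than exactly this in the hashed string):

    $ printf '%s' 'my-complex-command --some-flag my-positional-arg-1' \
        | sha512sum | cut -d' ' -f1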
Yes, I know. I should've picked a different example, but it's also realistic in a way. When I'm doing one-offs, I will sometimes take shortcuts like this. I know awk fairly well, and I know enough jq to know that invoking `jq .` pretty-prints the inbound JSON across multiple lines. While I know I could write a proper jq expression, the combo gets me there quicker. Similarly, I'll sometimes do:
$ awk '...' | grep | ...
Because I'm too lazy to go back to the start of the awk invocation and add a match condition there. If I'm going to save it to a script, I'll clean it up. (And for jq, I have to be honest: my starting point these days would probably be to show my contraption to an LLM and use its answer as a starting point. I don't use jq nearly enough to learn its language by heart.)
It memoizes the command passed to it.
Manually clearing it (for example if I know the underlying data has changed) is also possible. In-pipeline memoization includes the input in the hash of the lookup, so both of those will run curl once. NOTE: currently, environment variables are not taken into account when hashing.
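To illustrate in-pipeline memoization with the example from the script header above:

    $ cat sitelist | memo -s parallel curl | grep "server:"

With -s, the contents of sitelist are hashed together with the command, so rerunning the pipeline with an unchanged sitelist serves the curl output from the cache, while editing sitelist triggers a fresh run.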