jam.vim is some pretty syntax highlighting I made for bjam, aka Boost.Jam, aka Boost.Build, aka what-the-hell-do-I-call-this-thing? Anyway, it's an alternative to make, a lot more powerful and a lot simpler to use. I based it on something by a certain Matt Armstrong that I found on some mailing list.
sys_fstream.hpp (20090309) is a reimplementation of C++'s <fstream> so that I could get at the file descriptor if I needed it. So if you want to open a file before calling fstat(2), but still want to use C++'s I/O, you can. You can even construct the object from a file descriptor or a FILE* pointer. Obviously, since standard C++ I/O uses buffers, as does C I/O, it's a bad idea to use the file descriptor or FILE* pointer for I/O unless you promise to flush buffers before switching to a different I/O system (ostreams only).
One usage example: if the following were running set-uid root, the user wouldn't be able to dump /etc/shadow by exploiting a race condition:

    #include <iostream>
    #include <string>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include "sys_fstream.hpp"

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            std::cerr << "need arg\n";
            return 1;
        }
        std::extra::sys_ifstream in(argv[1]);
        if (!in) {
            std::cerr << "failed to open\n";
            return 1;
        }
        // fstat(2) the already-open stream: no window between check and use
        struct stat st;
        fstat(in.fd(), &st);
        uid_t uid = getuid();
        if (uid && uid != st.st_uid) {
            // don't bother checking gids in this example
            std::cerr << "you don't have permission\n";
            in.close();
            return 1;
        }
        std::cout << "access granted:\n=============\n";
        std::string line;
        while (std::getline(in, line))
            std::cout << line << '\n';
        return 0;
    }
fibheap.hpp (20091119) An implementation of the Fibonacci heap as described in Cormen et al., with a superset of STL's std::priority_queue's operations (the new functions being pop(pointer) and decrease(pointer, key)).
The only deficiencies are that the copy constructor does not keep the old heap structure (the complexity is the same, but there is some wasted work), and there is no merge operation.
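Since the pointer-based calls are the whole point, here's a sketch of what using them might look like. The class name, the template parameter, and a push() that returns the node pointer are my guesses (check the header for the real spellings); pop(pointer) and decrease(pointer, key) are the documented additions.

    #include <iostream>
    #include "fibheap.hpp"

    int main()
    {
        // class name and the pointer-returning push() are assumptions
        fibheap<int> heap;
        fibheap<int>::pointer p = heap.push(42);
        heap.push(7);
        heap.decrease(p, 3);             // that node's key drops from 42 to 3
        std::cout << heap.top() << '\n'; // 3, if this is a min-heap as in Cormen
        heap.pop(p);                     // remove that node directly
        return 0;
    }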
redblack.hpp (20091110) An implementation of Red-black trees as described in Cormen et al., with an interface almost identical to STL's std::multimap.
I totally needed this for computational geometry, but couldn't get it working.
All of that has changed today, my friends.
So … I hope it works.
I have consistency checks implemented (disabled now) that stopped complaining a while ago, so it really should be fine.
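Since the interface is almost std::multimap's, usage should look about like this (the class name and template parameters are my guesses; the rest is stock multimap idiom, duplicate keys and all):

    #include <iostream>
    #include <string>
    #include <utility>
    #include "redblack.hpp"

    int main()
    {
        // assumed to behave like std::multimap<std::string, int>
        redblack<std::string, int> tree;
        tree.insert(std::make_pair("segment", 2));
        tree.insert(std::make_pair("segment", 5)); // duplicates are fine
        tree.insert(std::make_pair("point", 3));
        for (redblack<std::string, int>::iterator i = tree.begin();
            i != tree.end(); ++i)
            std::cout << i->first << " => " << i->second << '\n';
        return 0;
    }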
fluks is my implementation of the LUKS (Linux Unified Key Setup) standard. With LUKS there is a master key, used to encrypt a disk partition, and this master key is encrypted in the LUKS header with a much simpler password. I use this on my desktop system. This is the largest thing I have ever written.
Included in fluks is a C99 implementation of CAST6, a.k.a. CAST-256, described in RFC 2612. It is licensed under ISC and is therefore the only unencumbered C implementation that I know of. I won't call this optimized, but you'd be hard-pressed to make it much faster, I bet. CAST-256 was not an AES finalist.
There's also a C99 implementation of the Tiger hash, using the OpenSSL-style init(), update(), end() interface. License: ISC. The reference version had no update() procedure. Mine is also heavily annotated. I should note that it hasn't been tested on big-endian systems, but I have faith that it will work. The Tiger hash is designed to run well on 64-bit systems.
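The interface is the usual incremental-hashing dance. The identifiers below are invented (I don't have the header in front of me), but the shape is the OpenSSL style just described:

    #include <cstddef>
    #include <cstdio>
    #include "tiger.h"

    int main()
    {
        // context and function names are placeholders for whatever
        // tiger.h actually declares
        TIGER_CTX ctx;
        unsigned char digest[24]; // Tiger produces 192 bits

        tiger_init(&ctx);
        tiger_update(&ctx, (const unsigned char *)"abc", 3);
        tiger_update(&ctx, (const unsigned char *)"def", 3); // any number of calls
        tiger_end(&ctx, digest);

        for (size_t i = 0; i < sizeof(digest); i++)
            std::printf("%02x", digest[i]);
        std::printf("\n");
        return 0;
    }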
It is also the home of an independent Serpent implementation, written in C++ (for the references) and accessible to C (at minimum, link with libsupc++). License: ISC.
It is generally efficient as it uses bitslicing (with very short S-box functions), but I haven't benchmarked it.
There is a faster, GPL-encumbered version, but it relies on superscalars and CISC architectures.
Serpent was an AES finalist, edged out by Rijndael because it was harder/slower to implement.
It is more secure, though (as far as anybody really knows), if only because it runs slower.
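To show what the bitslicing means (with a made-up 2-bit S-box, not one of Serpent's), here's the trick in miniature: each 32-bit word holds one bit position from 32 independent inputs, so a couple of boolean operations evaluate the S-box on all 32 lanes at once, and the S-box function stays tiny.

    #include <cstdint>
    #include <cstdio>

    // toy 2-bit S-box {0->3, 1->0, 2->2, 3->1}, reduced to boolean algebra:
    // out0 = ~(in0 ^ in1), out1 = ~in0
    static void sbox(uint32_t &b0, uint32_t &b1)
    {
        uint32_t o0 = ~(b0 ^ b1);
        uint32_t o1 = ~b0;
        b0 = o0;
        b1 = o1;
    }

    int main()
    {
        // lane 0 holds the input value 1 (high bit in b1, low bit in b0)
        uint32_t b0 = 0x1, b1 = 0x0;
        sbox(b0, b1);
        // the S-box maps 1 to 0, so this prints "lane 0: 0"
        std::printf("lane 0: %u\n", ((b1 & 1u) << 1) | (b0 & 1u));
        return 0;
    }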
Let's see, what else? I optimized the reference implementation of the Whirlpool hash, and it's now C99 with zero macros. It no longer allows hashing of partial bytes of data. The alignment issues involved are just too much.
javascrypto is a Javascript implementation of a cipher (just Serpent so far), hash functions (Tiger and Whirlpool), HMAC (hashed MAC), PBKDF2 (password-based key derivation function), and CTR encryption. Please don't ignore the fact that you cannot trust unsigned crypto implementations, especially over the internets. Who knows what's changed? I only wrote all this for my email page.
flacsplit Splits a FLAC or WAVE file based on a corresponding CUE sheet, compressing the output to FLAC and tagging it from the CUE sheet values. It does everything in memory, saving a lot in I/O costs, so it's fast.
strftime.js strftime(3) in Javascript. 8.4 kB plain, 3.7 kB minified. I haven't tested the week logic, but it seems good!
ntg9.tar.gz There may have been a way to generate this, but … anyway … I wanted to use the artikel3 document class in LaTeX, but I wanted the font size to be 9 point. I basically copied the sizes from extarticle's size9.clo file and … sort of interpolated between ntg's ntg12.clo, ntg11.clo, and ntg10.clo files.
I haven't the foggiest how it works.
smallworld.tar.xz (not a utility).
I got really frustrated by a certain puzzle from the old Facebook job puzzles.
The test code for it is really screwy, i.e. they contrived tests that would not work for implementations using floats. So you're stuck with doubles, which is not memory efficient, even though memory efficiency is what they wanted.
Well, in protest of this screwy puzzle, I'm posting my code.
I licensed it under MIT.
Successful evaluations get sent to the Facebook people, so they should see my name in it if you submit it legally :).
I now know that it can be solved in O(n lg n) time using an algorithm based on Delaunay triangulation.
sudoku.tar.xz (20091123) A sudoku solver I wrote for 9x9 and 16x16 puzzles. It's the only way I'll ever have the patience to solve a 16x16 sudoku. I did it once for real, and that was enough. I pronounce it soo-DOCK-oo. I'm probably wrong.
I run OpenBSD-current, and I hate maintaining it. These make my life loads easier.
clean-tree.c (20090607) is a utility that I place in /usr/ports to remove all of the w-* directories, which are used to build packages. It is obsolete now that ports puts everything in /usr/ports/obj. This couldn't be written as a shell script, since using find would traverse into the w-* directories, even though I don't care what's in there.
So, it was down to Python or C (I didn't yet know Perl).
For the challenge, I chose C.
To build it, just cd to the directory with it and type CFLAGS="$CFLAGS -std=c99" make clean-tree.
cvs-update (20080901) is a Python script that updates the CVS repositories.
It also rebuilds /usr/ports/INDEX, which is a necessary step when using build-world below.
It should be configured by editing the variables near the top.
build-world-20100128.tar.xz is a Perl program that builds all packages from a file called 'world'. Each line of 'world' is a package name, possibly with a partial version (e.g. gkrellm, jdk-1.7); there's an example below.
When you give it the --update argument, it will rebuild all packages that have updates. Also with --update, if the version has increased for a package (not the OpenBSD patch number), anything depending on it will be rebuilt.
In the future, I will add support for setting FLAVOR in the 'world' file. Right now, I'm just setting FLAVOR at the top of the port Makefiles.
Sloppy, I know.
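For what it's worth, a 'world' file might look like this (entries made up for illustration):

    gkrellm
    jdk-1.7
    mplayer
    zsh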
As a side-note, I rewrote this in Python for fun.
It used fewer lines, but that's probably because there weren't any lines with a lone closing brace, as is common in Perl. Also, every line was longer, so it ended up being bigger overall. It executed a second faster in --pretend mode (5 seconds vs. 6). It was also much clearer, since Python's API is primarily object-oriented where Perl's is procedural.
Here are two patches I wrote for publicfile, a simple and secure http/ftp server from D. J. Bernstein.
publicfile-0.52-sorted.patch sorts the ftp listings before they're sent out.
It uses a binary search tree, because that's what I wanted to write.
20091123: Changed the formatting (OpenBSD KNF!), made the implementation not stupid-complicated, and made it a little faster, I'm sure.
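The sorting idea itself fits in a few lines. This is an illustration of the technique, not the patch: insert each name into a binary search tree, and an in-order walk emits them sorted.

    #include <cstdio>
    #include <cstring>

    struct node {
        const char *name;
        node *left, *right;
    };

    // walk down to the proper null link and hang the new node there
    static void insert(node **root, const char *name)
    {
        while (*root)
            root = std::strcmp(name, (*root)->name) < 0 ?
                &(*root)->left : &(*root)->right;
        node *n = new node;
        n->name = name;
        n->left = n->right = 0;
        *root = n;
    }

    // in-order traversal visits the names in sorted order
    static void walk(const node *n)
    {
        if (!n)
            return;
        walk(n->left);
        std::puts(n->name);
        walk(n->right);
    }

    int main(int argc, char **argv)
    {
        node *root = 0;
        for (int i = 1; i < argc; i++)
            insert(&root, argv[i]);
        walk(root);
        return 0;
    }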
publicfile-0.52-allowspace.patch allows spaces to appear in filenames.
I cannot think of any reason not to, and it is possibly a bug in DJB's code.
It's doubtful, but the only fix I needed for allowing spaces was the difference between < and <=.
Now I usually frown on spaces within filenames, but with my music collection, I make an exception.
Did I just say I have my music files hosted with FTP on my various computers?
No, I clearly did not say that.
bestzip.tar.xz compresses a file with gzip and lzma, and keeps whichever gave better compression.
I ran this in the directory where I keep my Windows software on my fileserver and saved probably 100MB.
Written in C++, with boost for the program_options library and the scoped_array type, and SUSv3 for exec (mainly).
Also requires gzip and lzma-utils.
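The core decision is tiny; something like this sketch (assuming a gzip and lzma that understand -k to keep the input, and ignoring shell-quoting of the filename):

    #include <sys/stat.h>

    #include <cstdio>
    #include <cstdlib>
    #include <string>

    // file size in bytes, or -1 if stat(2) fails
    static long long size_of(const std::string &path)
    {
        struct stat st;
        if (stat(path.c_str(), &st) == -1)
            return -1;
        return st.st_size;
    }

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            std::fprintf(stderr, "usage: bestzip FILE\n");
            return 1;
        }
        std::string file = argv[1];
        // compress with both; -k keeps the original around for the second run
        std::system(("gzip -9k " + file).c_str());
        std::system(("lzma -9k " + file).c_str());
        std::string gz = file + ".gz", lz = file + ".lzma";
        // keep whichever came out smaller; drop the loser and the original
        if (size_of(gz) <= size_of(lz))
            std::remove(lz.c_str());
        else
            std::remove(gz.c_str());
        std::remove(file.c_str());
        return 0;
    }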
dirsize-20100314.tar.xz recursively sums the sizes and contained sizes of all arguments.
Written in C++ with boost for the filesystem library, so it's totally cross-platform :).
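The gist of it, as a sketch with boost::filesystem (not the actual source):

    #include <iostream>

    #include <boost/cstdint.hpp>
    #include <boost/filesystem.hpp>

    namespace fs = boost::filesystem;

    int main(int argc, char **argv)
    {
        for (int i = 1; i < argc; i++) {
            boost::uintmax_t total = 0;
            // walk everything beneath the argument, counting regular files
            fs::recursive_directory_iterator it(argv[i]), end;
            for (; it != end; ++it)
                if (fs::is_regular_file(it->status()))
                    total += fs::file_size(it->path());
            std::cout << argv[i] << ": " << total << " bytes\n";
        }
        return 0;
    }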
rescue-20090202.tar.xz is a tool that I developed to recover from a catastrophic loss of data. The lovely Linux developers worked on ext4 for years. One day, they announced, 'it's stable!' So I threw caution to the wind and made the switch. Biggest mistake ever. At some point, the root directory was erased (as in, size = 0). Other problems I encountered along the way: the group descriptor table was gone, so I wrote a program to guess, for each group, what block the inodes started at (powers of 3, 5, and 7 are different from the others); I couldn't seem to find any inodes in the first group; and the ext4 documentation == code.
So, I set out on a quest to recover my data from the evil clutches of ext4, armed with nothing but cygwin, vim, hexedit, python, C, and my wits.
This is the culmination of all my ext4 knowledge.
Although not perfect, it got all 23 GB of my data back at a stellar 1 MB/s (I figure if I had made the I/O asynchronous, it would have been quicker).
Also, the program writes a log that contains the permissions of all files, as well as the symbolic links, since none of this can be done on a FAT32 filesystem.
There is a tool called run_log that will process the log, applying the permissions and recreating the symlinks.
Written in C99 for little-endian systems (I'm not sure how extfs behaves on big-endian systems).
I now use XFS.
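About that powers-of-3-5-7 remark: under extfs's sparse_super option, only groups 0 and 1 and the groups whose number is a power of 3, 5, or 7 carry superblock and group-descriptor backups, which shifts where their inode tables begin. The guesswork boils down to a check like this (my paraphrase, not the program's code):

    #include <cstdio>

    // true if n is base^k for some k >= 0
    static bool is_power_of(unsigned n, unsigned base)
    {
        while (n > 1) {
            if (n % base)
                return false;
            n /= base;
        }
        return n == 1;
    }

    // groups that carry the backup copies under sparse_super
    static bool has_backup(unsigned group)
    {
        return group <= 1 || is_power_of(group, 3) ||
            is_power_of(group, 5) || is_power_of(group, 7);
    }

    int main()
    {
        // prints 0 1 3 5 7 9 25 27 49
        for (unsigned g = 0; g < 50; g++)
            if (has_backup(g))
                std::printf("%u ", g);
        std::printf("\n");
        return 0;
    }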
multicopy.tar.xz copies a file to multiple destinations. Useful when more than one device is involved. Runs at the speed of the slowest device, and relies on the OS for asynchronous writes.
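The idea in miniature (a sketch, not the shipped code; C++11 for the movable streams):

    #include <fstream>
    #include <iostream>
    #include <vector>

    int main(int argc, char **argv)
    {
        if (argc < 3) {
            std::cerr << "usage: multicopy SRC DST...\n";
            return 1;
        }
        std::ifstream in(argv[1], std::ios::binary);
        std::vector<std::ofstream> out;
        for (int i = 2; i < argc; i++)
            out.emplace_back(argv[i], std::ios::binary);

        // read each block once and fan it out to every destination; the
        // OS's write-behind caching supplies the asynchrony
        char buf[1 << 16];
        while (in.read(buf, sizeof(buf)), in.gcount() > 0)
            for (auto &o : out)
                o.write(buf, in.gcount());
        return 0;
    }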
Support me for some reason?
LgTxEaxzhjjrQPqpzZ9JjDGAm8Wjq6vHXJ
Here's hoping for many long centuries of stable and uncontrollable currency.