Eric Shane Radman

8 Tudor Dr.
Endicott, NY 13760

Professional Experience

8/2022 – Present Starfish Storage

Systems Engineer, Remote

Extended the capabilities of Redash to allow for nicely formatted alerts. Developed periodic queries to provide trend information about storage under management.

Implemented commit hooks to improve code quality by warning of formatting inconsistencies as well as finding and fixing common mistakes.

Designed a tool for sending storage reports over e-mail by rendering Redash dashboards as PDFs. Developed a configuration format which enabled dashboard parameters to be customized for each recipient. Raw data from individual queries could also be generated and attached using custom parameters.

Documented strategies for using Citus Data to distribute storage and processing of tables greater than 5TB.

6/2010 – 7/2022 D. E. Shaw Research, LLC

Systems Administrator, Endicott NY

Implemented automatic failover for PostgreSQL using Consul and a custom health check daemon which enabled nearest_n service discovery. Extended existing configuration management to handle provisioning. Used Consul Template to redirect existing connections on PgBouncer after a leader is promoted.

Collaborated on Anton 3 task scheduling using Slurm. Ensured that logical Anton submachines could be effectively reconfigured by allocating frontend resources on Kubernetes.

Created a REST service and Web UI that provides access to simulation data based on group membership. Authentication tokens are obtained using Kerberos or MUNGE. Used PostgreSQL logical replication to subscribe to upstream data.

Designed a service for monitoring database availability characteristics such as connect times, recovery status and disruption of idle connections.

Reorganized PostgreSQL administration using a set of scripts after finding that the learning curve of SaltStack was prohibitive for other team members. Database relationships and major versions were defined in a single YAML file.

Established database roles for all Anton services and created a schema layout using the minimum set of privileges for each role. Institutionalized a mechanism for retrieving database credentials, and adapted test harnesses to use the common schema.

Implemented a test harness for cluster provisioning using an SQLAlchemy-derived schema. Accomplished progressive migration from MySQL by periodically refreshing a PostgreSQL instance for read-only services.

Release engineer for Desjob, used to create and start simulations on Anton and Anton 2. Automatic failure recovery and data-driven error classification enabled simulations to survive faulty hardware.

Primary author and maintainer of AMSv2, an application server used to provide administrative control over the logical and physical components of Anton 2. Improved performance characteristics over time by restructuring service methods around tests to verify that the object/relational mapper was generating correct queries.

Conducted approximately 140 phone screens and in-house interviews for positions in Operations, System Software, System Administration, and Scientific Software. In all cases care was taken to produce an essay describing a candidate's potential and fit for a given position.

3/2009 – 6/2010 D. E. Shaw Research, LLC

Systems Contractor, Endicott NY

Key developer for AMS, an application server used to provide record keeping, administrative control, and simulation state for Anton. Provided the Operations team with a user interface for visualizing submachine and queue utilization.

Assisted with maintenance of a 1500-node Linux cluster by routinely diagnosing and replacing faulty hardware. Facilitated the installation and functional testing of eighteen 512-node Anton supercomputers.

10/2001 – 7/2009 Telecommunication Enterprises Inc

Network Administrator, Clarks Summit PA

Handled all technical aspects of running a regional ISP. Provided support and generated documentation for 8 dedicated T1/ISDN accounts, 1300 dial-up users, and 2100 e-mail addresses. Executed a migration of all services and equipment to a new collocation facility during an acquisition of

Enabled the business office to provide support to e-mail and dial-up customers by implementing a complete account management interface and tutoring staff on basic troubleshooting techniques using command line utilities.

Designed an in-house ticket management system aimed at improving visibility into outstanding issues. Substantially improved the effectiveness of a small team by making progress visible.

Provided on-site consulting in voice and data networking to an average of sixty companies in Northeastern Pennsylvania.


2013 BSD Associate recertification
Philadelphia PA
2009 BSD Associate from the BSD Certification Group
Washington D.C.
1997 Troubleshooting, Maintaining, and Repairing Personal Computers
Open book exam

Talks and Presentations

Minimalist Scripted Configuration, Agent-Server, Gather-Fact, and Remote Execution
NYCBUG, June 2, 2021

Single-Minded Configuration, A Stabilized Approach to Systems Orchestration
Recording, December 19, 2020

Overcoming First Principles, A guide for accessing the features of PostgreSQL in test-driven development
PGConfUS, April 19, 2016

Learning Through Composition, A study in building modern Unix tooling
NYCBUG, January 13, 2015

Knowledge Sharing and Open-Source

2021 – Present

A web service which allows users to become contributors to web pages by selecting text and submitting comments. Each comment creates a ticket, which gives site operators the opportunity to refine and incorporate feedback.

2018 – Present

Project page for rset(1)/pln(5), an approach to configuring remote systems using common scripting languages and tools.

2018 – Present

A site to showcase some of my work in analog photography, while providing others with an explanation of techniques that I have acquired in using cameras that do not have advanced automation.

2014 – Present

Project page for pg_tmp(1), a tool for quickly spinning up temporary PostgreSQL databases. Crafted to give unit tests full access to the capabilities of the database.

2012 – Present

Project page for entr(1), a cross-platform tool for running arbitrary commands when files change. Crafted to promote rapid feedback and automated testing.

1999 – Present

Tutorials and commentary on programming, networking, databases, and administration of Unix-like systems.

Last updated on March 23, 2023