8 Tudor Dr.
Endicott, NY 13760
Standardized Redash deployments by automatically tagging each month's release candidate and testing release images against the Starfish stack. Lightened internal maintenance by reviewing outside contributions and integrating patches with the upstream project.
Implemented a method of expiring database records to allow each volume under management to efficiently remove history that is no longer needed. Documented the tradeoffs and best practices for repacking/reindexing databases as large as 30TB in size.
Extended the capabilities of Redash to allow for informative email alerts. Developed periodic queries to provide trend information about storage under management.
Designed a tool for sending storage reports over email by rendering Redash dashboards as PDFs along with raw data from individual queries. Developed a configuration format which enabled dashboard parameters to be customized for each recipient. StarfishStorage/redash-email was released as an open source project in 2024.
Implemented automatic failover for PostgreSQL using Consul and a custom health check daemon which enabled nearest_n service discovery. Extended existing configuration management to handle provisioning. Used Consul Template to redirect existing connections on PgBouncer after a leader is promoted.
Collaborated on Anton 3 task scheduling using Slurm. Ensured that logical Anton submachines could be effectively reconfigured by allocating frontend resources on Kubernetes.
Creator of a REST service and Web UI that provides access to simulation data based on group membership. Authentication tokens obtained using Kerberos or MUNGE. Used PostgreSQL logical replication to subscribe to upstream data.
Designed a service for monitoring database availability characteristics such as connect times, recovery status and disruption of idle connections.
Reorganized PostgreSQL administration using a set of scripts after finding that the learning curve of SaltStack was prohibitive for other team members. Database relationships and major versions were defined in a single YAML file.
Established database roles for all Anton services and created a schema layout using a minimum set of privileges for each role. Institutionalized a mechanism for retrieving database credentials, and adapted test harnesses to use the common schema.
Implemented a test harness for cluster provisioning using an SQLAlchemy-derived schema. Accomplished progressive migration from MySQL by periodically refreshing a PostgreSQL instance for read-only services.
Release engineer for Desjob, used to create and start simulations on Anton and Anton 2. Automatic failure recovery and data-driven error classification enabled simulations to survive faulty hardware.
Primary author and maintainer of AMSv2, an application server used to provide administrative control over the logical and physical components of Anton 2. Improved performance characteristics over time by restructuring service methods around tests to verify that the object/relational mapper was generating correct queries.
Conducted approximately 140 phone screens and in-house interviews for positions in Operations, System Software, System Administration, and Scientific Software. In all cases care was taken to produce an essay describing a candidate's potential and fit for a given position.
Key developer for AMS, an application server used to provide record keeping, administrative control, and simulation state for Anton. Provided the Operations team with a user interface for visualizing submachine and queue utilization.
Assisted with maintenance of a 1500-node Linux cluster by routinely diagnosing and replacing faulty hardware. Facilitated the installation and functional testing of eighteen 512-node Anton supercomputers.
Handled all technical aspects of running a regional ISP, teisprint.com. Provided support and generated documentation for 8 dedicated T1/ISDN accounts, 1300 dial-up users, and 2100 email addresses. Executed a migration of all services and equipment to a new colocation facility during an acquisition of ezaccess.net.
Enabled the business office to provide support to email and dial-up customers by implementing a complete account management interface along with tutoring on basic troubleshooting techniques using command line utilities.
Designed an in-house ticket management system aimed at improving visibility on outstanding issues. Substantially improved the effectiveness of a small team by making progress visible.
Provided on-site consulting in voice and data networking to an average of sixty companies in Northeastern Pennsylvania.
2013 | BSD Associate recertification | Philadelphia, PA
2009 | BSD Associate from the BSD Certification Group | Washington, D.C.
1997 | Troubleshooting, Maintaining, and Repairing Personal Computers | Open book exam
Minimalist Scripted Configuration,
Agent-Server, Gather-Fact, and Remote Execution
NYCBUG, June 2, 2021
Single-Minded Configuration,
A Stabilized Approach to Systems Orchestration
Recording, December 19, 2020
Overcoming First Principles,
A guide for accessing the features of PostgreSQL in test-driven development
PGConfUS, April 19, 2016
Learning Through Composition,
A study in building modern Unix tooling
NYCBUG, January 13, 2015
A web service which allows users to become contributors to web pages by selecting text and submitting comments. Each comment creates a ticket which gives site operators the opportunity to refine and incorporate feedback.
Project page for rset(1)/pln(5), an approach to configuring remote systems using common scripting languages and tools.
A site to showcase some of my work in analog photography, while providing others with an explanation of techniques that I have acquired in using cameras that do not have advanced automation.
Project page for pg_tmp(1), a tool for quickly spinning up temporary PostgreSQL databases. Crafted to give unit tests full access to the capabilities of the database.
Project page for entr(1), a cross-platform tool for running arbitrary commands when files change. Crafted to promote rapid feedback and automated testing.
Tutorials and commentary on programming, networking, databases, and administration of Unix-like systems.
Last updated on January 21, 2025