Gergely Orosz

Developer advice to self

Gergely Orosz shared advice that he’d give to himself 10 years ago. It’s interesting how hindsight is always 20/20…it’s easier to connect the dots looking back vs looking forward.

As I look back to over a decade ago, there are a few things I wish I’d started doing sooner. Habits that could have helped me grow faster and in a more focused way. This is the advice I’d give my younger self, who has just landed their first professional software engineering job.

1. Take the time to read two books per year on software engineering … Every time I took the time to slowly and thoroughly read a recommended book on software engineering, I leveled up. By properly reading, I mean taking notes, talking chapters through with others, doodling diagrams, trying out, going back, and re-reading…

Maxime Vaillancourt

Automatically labeling GitHub notification emails with Gmail filters

Maintaining a GitHub project with other people creates “many email notifications about various things.” But they don’t all hold the same importance. Maxime Vaillancourt shows us how to use Gmail filters and labels to better manage all the emails coming from GitHub issues, etc.

I receive many email notifications about various things that happen on there: direct requests to review a particular piece of code, feedback on pull requests I’ve opened, pull requests merged by their authors, people directly mentioning my username in a comment, issues closed by their authors, etc. I receive hundreds of emails every single week.

…using Gmail filters, we can automatically add labels to GitHub notification emails based on their content. This solution takes less than 10 minutes to implement, and the long-term return on investment is quite appreciable.
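
As a rough illustration of the approach described above, the sketch below builds one Gmail search query per label, which could then be pasted into Gmail's "Create filter" dialog. The label names and the footer phrases are assumptions for illustration, not taken from Maxime's post; check them against the notification emails you actually receive before relying on them.

```python
# Sender address used by GitHub notification emails.
GITHUB_SENDER = "notifications@github.com"

# Hypothetical label -> phrase mapping. The phrases are assumed to match
# the footer text of GitHub notification emails; verify against your inbox.
LABEL_PHRASES = {
    "GitHub/Mentioned": "You are receiving this because you were mentioned",
    "GitHub/Review requested": "You are receiving this because your review was requested",
}


def build_filter_query(phrase, sender=GITHUB_SENDER):
    """Return a Gmail search query matching emails from `sender` that contain `phrase`."""
    return f'from:{sender} "{phrase}"'


def build_filters(label_phrases=LABEL_PHRASES):
    """Map each label name to the search query its filter should use."""
    return {label: build_filter_query(phrase) for label, phrase in label_phrases.items()}


if __name__ == "__main__":
    for label, query in build_filters().items():
        print(f"{label}: {query}")
```

Each printed query goes into the filter's search criteria, and the matching label is chosen under "Apply the label" when the filter is created.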

Morris Brodersen

A case study on vanilla web development

It’s a TeuxDeux clone in plain HTML, CSS and JavaScript. More importantly, it’s a case study showing that vanilla web development is viable in terms of maintainability, and worthwhile in terms of performance.

There’s no custom framework invented here. Instead, the case study was designed to find minimum viable patterns that are truly vanilla (see the rules). The result is maintainable, albeit verbose and with considerable duplication in certain areas.

If anything, it shows the value frameworks provide, but also highlights how effectively standard web technologies can be used.

Heroku – Sponsored

🎧 Processing large datasets with Python

From Heroku’s Code[ish] podcast, Greg Nokes and special guest JT Wolohan talk about Python in a large dataset world. Bonus — they share a 40% discount code for JT’s book!

Python is familiar to most developers as a high-level scripting language that’s popular in scientific communities. But some of its main benefits include the data processing ecosystem that’s been built around it. In particular, the machine learning communities, coupled with its lightweight asynchronous frameworks, have brought a new interest in how Python works with massive datasets.

J.T. Wolohan, the author of “Mastering Large Datasets with Python,” joined Greg Nokes, Master Technical Architect at Heroku, to talk about the application of Python and massive datasets.

Command line interface

An intuitive CLI for processing video (powered by ffmpeg)

ffmpeg is an incredibly powerful tool, but its many flags and options make it not the easiest thing to wield (especially if you use it just infrequently enough to forget the magic syntax you ginned up last time).

vdx makes ffmpeg more approachable for many of the common video processing operations you may need on a regular basis. Examples!

$ vdx '*.mov' --crop=360,640    # Crop to width 360, height 640
$ vdx '*.mov' --format=gif      # Convert to GIF
$ vdx '*.mov' --fps=12          # Change the frame rate to 12
$ vdx '*.mov' --no-audio        # Strip audio
$ vdx '*.mov' --resize=360,-1   # Resize to width 360, maintaining aspect ratio
$ vdx '*.mov' --reverse         # Reverse
$ vdx '*.mov' --rotate=90       # Rotate 90 degrees clockwise
$ vdx '*.mov' --speed=2         # Double the speed
$ vdx '*.mov' --trim=0:05,0:10  # Trim from time 0:05 to 0:10
$ vdx '*.mov' --volume=0.5      # Halve the volume

Linode – Sponsored

Understanding Kubernetes: A guide to modernizing your cloud infrastructure

Learn fundamental concepts of Kubernetes, from the components of a Kubernetes cluster to network model implementation. After reading this guide, you’ll have a working knowledge of containers and be able to jump right in and deploy your first Kubernetes cluster.

This is a free guide and available as an instant download with no registration required.

Start on Linode today and receive $100 in credit.


A Go unikernel running on x86 bare metal

Run a single Go application on x86 bare metal. It is written entirely in Go (with only a small amount of C and some assembly), supports most features of Go (such as GC and goroutines) and the standard library, and comes with a network stack that can run most net-based libraries.

The entire kernel is a Go application running on ring0. There are no processes and no process synchronization primitives, only goroutines and channels. There is no ELF loader, but there is a JavaScript interpreter that can run JS files, and a WASM interpreter will be added later to run WASM files.

Goroutines correspond to processes and channels are used for inter-process communication (IPC). Also, it runs JavaScript ¯\_(ツ)_/¯

Ruurtjan Pul – an online tool for exploring DNS records

Ruurtjan Pul writes:

It’s been my side project for the past half year. In contrast to existing alternatives, my aim is for it to be simple, powerful, user-friendly. I’ll be adding more features the coming time, but it should be useful as is already.

I ran a few test lookups to kick the tires and the site is fast, simple, and displays the information in an easily digestible format. Worth a bookmark!


Keel is a tool for automating Kubernetes deployment updates

kubectl is the new SSH. If you are using it to update production workloads, you are doing it wrong. See examples on how to automate application updates.

We’re using this in our new Kubernetes-based infrastructure (more details on that coming to a podcast near you). Keel runs as a single container, scanning Kubernetes and Helm releases for outdated images. Super cool stuff, and even has a web interface (which we’re not using yet, but should).

Machine Learning

The case for a learned sorting algorithm

Adrian Colyer walks us through a paper from SageDB that’s taking machine learning and applying it to old Computer Science problems such as sorting. Here’s the big idea:

Suppose you had a model that given a data item from a list, could predict its position in a sorted version of that list. 0.239806? That’s going to be at position 287! If the model had 100% accuracy, it would give us a completed sort just by running over the dataset and putting each item in its predicted position. There’s a problem though. A model with 100% accuracy would essentially have to see every item in the full dataset and memorise its position – there’s no way training and then using such a model can be faster than just sorting, as sorting is a part of its training! But maybe we can sample a subset of the data and get a model that is a useful approximation, by learning an approximation to the CDF (cumulative distribution function).
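
To make the quoted idea concrete, here is a minimal sketch, not the paper's implementation. It assumes the simplest possible "model": an empirical CDF built from a small sorted sample. Each item is placed in the bucket the model predicts for it, and a small per-bucket sort repairs whatever the approximation got wrong. The function name `learned_sort` and the `sample_size` and `seed` parameters are made up for this example.

```python
import bisect
import random


def learned_sort(data, sample_size=100, seed=0):
    """Sort `data` by predicting each item's position from a sampled CDF."""
    items = list(data)
    n = len(items)
    if n < 2:
        return items

    # "Train" the model: sort a small random sample of the input.
    rng = random.Random(seed)
    sample = sorted(rng.sample(items, min(sample_size, n)))

    def cdf(x):
        # Fraction of sampled items <= x: an approximation of the true CDF.
        return bisect.bisect_right(sample, x) / len(sample)

    # Place each item in the bucket at its predicted position.
    # The CDF is monotone, so a larger item never lands in an earlier bucket.
    buckets = [[] for _ in range(n)]
    for x in items:
        predicted = min(int(cdf(x) * n), n - 1)
        buckets[predicted].append(x)

    # Items sharing a bucket are only approximately ordered, so fix them up.
    result = []
    for bucket in buckets:
        bucket.sort()
        result.extend(bucket)
    return result


if __name__ == "__main__":
    print(learned_sort([0.9, 0.1, 0.5, 0.239806, 0.7]))
```

The better the sample reflects the real distribution, the fewer items collide in a bucket and the less work the fix-up pass has to do. With this crude step-function CDF many items share a bucket, which is why a smoother learned model is the interesting part.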


Firefox Reader View as a Linux CLI

Command line tool to extract the main content from a webpage, as done by the “Reader View” feature of most modern browsers. It’s intended to be used with terminal RSS readers, to make the articles more readable on web browsers such as lynx. The code is closely adapted from the Firefox version and the output is expected to be mostly equivalent.

I could see this fitting in nicely in a pipeline between curl and, well, lots of other commands.

Practical AI #109

When data leakage turns into a flood of trouble

Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.