The New Stack

How I built an on-premises AI training testbed with Kubernetes and Kubeflow

This is part 4 in a cool series on The New Stack exploring the Kubeflow machine learning platform.

I recently built a four-node bare metal Kubernetes cluster comprising CPU and GPU hosts for all my AI experiments. Though it makes economic sense to leverage the public cloud for provisioning the infrastructure, I invested a fortune in the AI testbed that’s within my line of sight.

The author shares many insights into the choices he made while building this dream setup.

RudderStack – Sponsored

Identity graph and identity resolution in SQL

In this post we show you how to achieve ID mapping in a data warehouse efficiently using SQL. This, however, scratches the surface of the problem of ID mapping. People have worked on developing sophisticated probabilistic techniques to associate IDs using statistical and machine learning approaches. The data warehouses themselves are adding in-house machine learning capabilities.

Stay tuned, we’ll explore that in the future.
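To make the idea of an identity graph concrete, here is a minimal sketch of deterministic ID mapping. It is not the SQL approach from the linked post: it swaps in a union-find structure in Python, and the edge list, the ID names, and the `resolve_identities` helper are all hypothetical, chosen only for illustration. Each edge says two IDs were observed together, and IDs connected directly or transitively end up in the same group.

```python
from collections import defaultdict


def resolve_identities(edges):
    """Group IDs so that directly or transitively linked IDs share one group."""
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller ID as root for determinism.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    for a, b in edges:
        union(a, b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical observations: an anonymous ID seen alongside a known user ID.
edges = [
    ("anon_1", "user_42"),
    ("anon_2", "user_42"),
    ("anon_3", "user_7"),
]
identity_groups = resolve_identities(edges)
print(identity_groups)
# [['anon_1', 'anon_2', 'user_42'], ['anon_3', 'user_7']]
```

In a warehouse the same grouping is usually computed iteratively over an edge table; the in-memory version above only shows what the result should look like.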


Disadvantages of Pull Requests

In this post, Tomas Wróbel lays out 10 potential drawbacks to the typical PR flows:

  1. More long living branches, more merge conflicts
  2. The reviewability of a change decreases with size
  3. Short feedback loop makes programming fun
  4. Reviews tend to be superficial
  5. Merging is blocked by remarks that shouldn’t be blocking
  6. It’s easier to fix than to explain the fix
  7. Developers are slower to adapt the responsibility mindset
  8. PRs discourage continuous refactoring
  9. Negative emotions and outright pathology
  10. How do you switch to branches with migrations


Am I FLoCed?

The EFF launched a new site you can use to see if your Chrome install is one that Google is testing FLoC on.

Google is running a Chrome “origin trial” to test out an experimental new tracking feature called Federated Learning of Cohorts (aka “FLoC”). According to Google, the trial currently affects 0.5% of users in selected regions, including Australia, Brazil, Canada, India, Indonesia, Japan, Mexico, New Zealand, the Philippines, and the United States.

They also do a nice job describing exactly what FLoC is and what it might mean regarding your privacy online.

Reasons I use the git cherry-pick command

Here is an example to help you understand the importance of cherry-picking. Suppose you have made several commits in a branch, but you realize it’s the wrong branch! What do you do now? Either you repeat all your changes in the correct branch and make a fresh commit, or you merge the branch into the correct branch. Wait, the former is too tedious, and you may not want to do the latter. So, is there a way? Yes, Git’s got you covered.

I’m a pretty big fan of cherry-pick, too. I don’t use it often, but every time I do… 👨‍🍳💋

O'Reilly Media – Sponsored

The Manager's Path (free book chapter)

Get Chapter 3 from The Manager’s Path free. If you’re a tech lead—or are responsible for promoting someone to fill that role—this chapter’s for you. It dives into what a tech lead does, how the job should be structured, how to manage projects, and most importantly, what makes a tech lead successful.

Oh, and that weird trick we mentioned? It’s on page 4 of this free download.


Nix is the ultimate DevOps toolkit

At Channable we use Nix to build and deploy our services and to manage our development environments. This was not always the case: in the past we used a combination of ecosystem-specific tools and custom scripts to glue them together. Consolidating everything with Nix has helped us standardize development and deployment workflows, eliminate “works on my machine”-problems, and avoid unnecessary rebuilds. In this post we want to share what problems we encountered before adopting Nix, how Nix solves those, and how we gradually introduced Nix into our workflows.

If Nix is intriguing to you, you’re going to love an upcoming episode of The Changelog. 😉

Command line interface

fselect – find files with SQL-like queries

This doesn’t aim to entirely replace find and ls, but if you already know SQL (like many of us do), why not be able to leverage that knowledge for your more advanced file-finding needs? Here’s a couple of examples so you get the idea:

Find temporary or config files (full path and size):

fselect size, path from /home/user where name = '*.cfg' or name = '*.tmp'

Use aggregate functions:

fselect "MIN(size), MAX(size), AVG(size), SUM(size), COUNT(*) from /home/user/Downloads"

Find by date and time intervals:

fselect path from /home/user where modified gte 2017-05-01
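For comparison, here is roughly what the first query above replaces if you were to script it by hand. This is a sketch in Python, not fselect's implementation; the `find_by_patterns` helper is hypothetical, and it assumes the search is recursive and that names are matched with shell-style globs. The demo runs against a throwaway temporary directory rather than /home/user.

```python
import fnmatch
import os
import tempfile


def find_by_patterns(root, patterns):
    """Yield (size_in_bytes, full_path) for files under root matching any glob."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                full_path = os.path.join(dirpath, name)
                yield os.path.getsize(full_path), full_path


# Demo: create three small files, then look for *.cfg and *.tmp only.
with tempfile.TemporaryDirectory() as root:
    samples = [("app.cfg", b"a=1\n"), ("scratch.tmp", b"xx"), ("notes.txt", b"hi")]
    for name, content in samples:
        with open(os.path.join(root, name), "wb") as handle:
            handle.write(content)
    matches = sorted(
        (size, os.path.basename(path))
        for size, path in find_by_patterns(root, ["*.cfg", "*.tmp"])
    )

print(matches)
# [(2, 'scratch.tmp'), (4, 'app.cfg')]
```

The one-line query expresses the same filter and projection without the loop, which is the appeal for anyone who already thinks in SQL.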


Apple releases a collection of Swift data structure implementations

Karoy Lorentey with the announcement:

The Swift Standard Library currently implements the three most essential general-purpose data structures: Array, Set and Dictionary. These are the right tool for a wide variety of use cases, and they are particularly well-suited for use as currency types. But sometimes, in order to efficiently solve a problem or to maintain an invariant, Swift programmers would benefit from a larger library of data structures.

We expect the Collections package to empower you to write faster and more reliable programs, with less effort.

This joins the Swift Algorithms and Swift Numerics packages in what is becoming a valuable, open source resource for Swift developers around the world to use.

InfoQ

Crystal goes 1.0

Congrats to the entire Crystal team and community on the big One O!

Crystal, a new object-oriented, compiled systems programming language that aims to blend the conciseness and friendliness of Ruby with the efficiency of C, recently released its first major version. Crystal 1.0 has a syntax close to Ruby’s and features statically inferred types, C bindings, and macros. Crystal may attract developers with a Ruby/Rails, Elixir/Phoenix background.

This has been a long time in the making. Can you believe it’s been five years since we had Ary and Juan on The Changelog? On that episode we discussed what it would take to get Crystal to 1.0…

The Changelog #435

The future of the web is HTML over the wire

This week we’re joined by long-time web developer Matt Patterson. Earlier this year Matt wrote an evocative article for A List Apart called The Future of Web Software Is HTML-over-WebSockets. In this episode Matt sits down with Jerod to discuss, in detail, why he believes the future of the web is server-rendered (again) and how Ruby on Rails is well positioned to bring that future to us today.


SCOTUS declares Google's copying of the Java SE API fair use

In a copyright decision that will undoubtedly have ripple effects on the software industry for years to come, the Supreme Court of the United States held that:

Google’s copying of the Java SE API, which included only those lines of code that were needed to allow programmers to put their accrued talents to work in a new and transformative program, was a fair use of that material as a matter of law.

This quote, pulled from the linked opinion by a Hacker News commenter, cuts right to the heart of the matter:

“Google copied approximately 11,500 lines of declaring code from the API, which amounts to virtually all the declaring code needed to call up hundreds of different tasks. Those 11,500 lines, however, are only 0.4 percent of the entire API at issue, which consists of 2.86 million total lines. In considering “the amount and substantiality of the portion used” in this case, the 11,500 lines of code should be viewed as one small part of the considerably greater whole. As part of an interface, the copied lines of code are inextricably bound to other lines of code that are accessed by programmers. Google copied these lines not because of their creativity or beauty but because they would allow programmers to bring their skills to a new smartphone computing environment.”
