IMTI - Craig Johnston

Linear Algebra: Singular Value Decomposition

Linear Algebra Crash Course for Programmers Part 10

This article covers Singular Value Decomposition (SVD), part ten of the series. SVD is arguably the most important matrix decomposition, with applications in image compression, recommender systems, pseudoinverse computation, and dimensionality reduction.

Posted by Craig Johnston Monday, April 20, 2020

Linear Algebra: Least Squares and Regression

Linear Algebra Crash Course for Programmers Part 9

This article covers least squares and regression, part nine of the series. Least squares is one of the most important applications of linear algebra and forms the foundation of regression analysis used throughout data science and machine learning.

Posted by Craig Johnston Saturday, February 15, 2020

Linear Algebra: Orthogonality and Projections

Linear Algebra Crash Course for Programmers Part 8

This article covers orthogonality and projections, part eight of the series. Orthogonality is fundamental to many algorithms including least squares regression, QR decomposition, and machine learning techniques like PCA.

Posted by Craig Johnston Tuesday, December 10, 2019

Linear Algebra: Eigenvalues and Eigenvectors Part 2

Linear Algebra Crash Course for Programmers Part 7

This article continues the exploration of eigenvalues and eigenvectors, focusing on diagonalization, computing matrix powers, and handling complex eigenvalues. Part seven of the series.

Posted by Craig Johnston Saturday, October 5, 2019

Linear Algebra: Eigenvalues and Eigenvectors Part 1

Linear Algebra Crash Course for Programmers Part 6

This article on eigenvalues and eigenvectors is part six of an ongoing crash course on programming with linear algebra. Eigenvalues and eigenvectors are among the most important concepts in linear algebra, with applications ranging from differential equations to machine learning algorithms like PCA.

Posted by Craig Johnston Tuesday, July 30, 2019

Linear Algebra: Vector Spaces and Subspaces

Linear Algebra Crash Course for Programmers Part 5

This article on vector spaces and subspaces is part five of an ongoing crash course on programming with linear algebra, demonstrating concepts and implementations in Python. Vector spaces provide the theoretical framework for understanding linear algebra, while subspaces help us analyze the structure of matrices and linear transformations.

Posted by Craig Johnston Saturday, May 25, 2019

Linear Algebra: Matrix Inverses and Determinants

Linear Algebra Crash Course for Programmers Part 4

This article on matrix inverses and determinants is part four of an ongoing crash course on programming with linear algebra, demonstrating concepts and implementations in Python. The inverse of a matrix and the determinant are fundamental concepts that reveal important properties about matrices and provide alternative methods for solving systems of linear equations.

Posted by Craig Johnston Wednesday, March 20, 2019

Linear Algebra: Systems of Linear Equations

Linear Algebra Crash Course for Programmers Part 3

This article on systems of linear equations is part three of an ongoing crash course on programming with linear algebra, demonstrating concepts and implementations in Python. We’ll explore how matrices provide a powerful framework for solving systems of equations, a fundamental problem that appears throughout science, engineering, and machine learning.

Posted by Craig Johnston Tuesday, January 15, 2019

Linear Algebra: Matrices

Linear Algebra Crash Course for Programmers Part 2a

This article on matrices is part two of an ongoing crash course on programming with linear algebra, demonstrating concepts and implementations in Python. The following examples will demonstrate some of the various mathematical notations and their corresponding implementations, easily translatable to any programming language with mature math libraries.

Posted by Craig Johnston Saturday, December 8, 2018

Linear Algebra: Vectors

Crash Course for Python Programmers Part 1

This article on vectors is part of an ongoing crash course on linear algebra programming, demonstrating concepts and implementations in Python. The following examples will demonstrate some of the algebraic and geometric interpretations of a vector using Python. A vector is an ordered list of numbers, represented in row or column form.

Posted by Craig Johnston Thursday, November 1, 2018
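
As a minimal sketch of the row and column forms mentioned in the excerpt above, the following uses plain Python lists. This is an assumption made for illustration, not necessarily the approach taken in the full article, and the helper names `shape` and `to_row` are hypothetical.

```python
# Hypothetical illustration: a vector as an ordered list of numbers.
row_vector = [1.0, 2.0, 3.0]                # row form: 1 x 3
column_vector = [[x] for x in row_vector]   # column form: 3 x 1


def shape(v):
    """Return (rows, cols) for a row vector (flat list) or a
    column vector (list of single-item lists)."""
    if v and isinstance(v[0], list):
        return (len(v), len(v[0]))
    return (1, len(v))


def to_row(column):
    """Flatten a column vector back into row form."""
    return [entry[0] for entry in column]


print(shape(row_vector))
print(shape(column_vector))
```

Both forms hold the same ordered numbers; only the layout differs, which matters once vectors are multiplied with matrices.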

Kafka on Kubernetes

Deploy a highly available Kafka cluster on Kubernetes.

Kafka is a fast, horizontally scalable, fault-tolerant message queue service. Kafka is used for building real-time data pipelines and streaming apps.

Posted by Craig Johnston Tuesday, September 25, 2018

Ethereum Ethstats

Learning the Ethereum Blockchain through its metrics.

The eth-netstats project provides a great dashboard interface for monitoring the status of an Ethereum Blockchain from the perspective of its nodes. The website https://ethstats.net/ reports statistics from an extensive list of Ethereum nodes on the public Ethereum Blockchain. However, the eth-netstats software that drives https://ethstats.net/ can also be used to monitor a Private Ethereum Blockchain, as I demonstrate in the previous article, Deploy a Private Ethereum Blockchain on a Custom Kubernetes Cluster.

Posted by Craig Johnston Saturday, September 22, 2018

Ethereum Blockchain on Kubernetes

Deploy a Private Ethereum Blockchain on a Custom Kubernetes Cluster.

Blockchain technologies have been made famous by cryptocurrencies such as Bitcoin and Ethereum. However, the concepts behind Blockchain reach far beyond their support for cryptocurrency. Blockchain technologies now support any digital asset, from signal data to complex messaging, to the execution of business logic through code. Blockchain technologies are rapidly forming a new decentralized internet of transactions.

Posted by Craig Johnston Tuesday, September 4, 2018

Blockchain

A Conceptual and Motivational Overview

Blockchain (The Internet of Transactions) may be a recent entry to the technology landscape. However, it has quickly become an essential iteration in the evolution of peer-to-peer communication and distributed computing. Originally developed as a way to protect digital currency, Blockchain technologies now support any digital asset, from signal data to complex messaging, to the execution of business logic through code. Blockchain technologies are rapidly forming a new decentralized internet of transactions.

Posted by Craig Johnston Saturday, September 1, 2018

Kubernetes Port Forwarding for Local Development

Using kubefwd

kubefwd enables a seamless and efficient way to develop applications and services on a local workstation. Locally develop applications that intend to interact with other services in a Kubernetes cluster. kubefwd allows applications with connection strings like http://elasticsearch:9200/ or tcp://db:3306 to communicate into the remote cluster. kubefwd can be used to reduce or eliminate the need for local, environment-specific connection configurations.

Posted by Craig Johnston Saturday, August 11, 2018

FaaS on Kubernetes

Kubeless, Python and Elasticsearch

FaaS (Function as a Service), also known as Serverless computing, is gaining popularity. Often discussed are the cost savings and each implementation’s relationship to the physical and network architecture of a specific platform or vendor. While many of the cost and infrastructure advantages of FaaS are compelling, they are only some of its many advantages. Below, I hope to demonstrate how easy it is to develop and deploy FaaS components into a custom Kubernetes cluster. The functions I develop are nearly all business logic, and I believe therein lies the advantage: high-density business logic. Functions can have a higher degree of focus directly on business logic and communication with other services. Functions can communicate with other functions, microservices or monoliths. In this article, I demonstrate this with Elasticsearch.

Posted by Craig Johnston Saturday, July 28, 2018

Elasticsearch Essential Queries

Getting started with Elasticsearch

The following is an overview of querying Elasticsearch. Over the years I have tried to assemble developer notes for myself and my team on a variety of platforms, languages and frameworks: a type of cheat-sheet with context, not a comprehensive how-to, but a decent 15-minute overview of the features we are most likely to implement in a given iteration.

Posted by Craig Johnston Thursday, July 26, 2018

Remote Query Elasticsearch on Kubernetes

Local workstation-based microservices development

Developing on our local workstations has always been a conceptual challenge for my team when it comes to remote data access. Local workstation-based development of services that intend to connect to a wide range of remote services that may have no options for external connections poses a challenge. Mirroring the entire development environment is possible in many cases, just not practical.

Posted by Craig Johnston Wednesday, July 25, 2018

High Traffic JSON Data into Elasticsearch on Kubernetes

Instant, reliable, send and forget.

IoT devices, Point-of-Sale systems, application events or any client that sends data destined for indexing in Elasticsearch often need to send and forget. However, unless that data is of low value, there needs to be assurance that it arrives at its final destination. Back-pressure and database outages can pose a considerable threat to data integrity.

Posted by Craig Johnston Wednesday, July 18, 2018

Kibana on Kubernetes

Visualize your Elasticsearch data.

This guide walks through a process for setting up Kibana within a namespace on a Kubernetes cluster. If you followed along with Production Grade Elasticsearch on Kubernetes then, aside from personal or corporate preferences, few modifications are necessary for the configurations below.

Posted by Craig Johnston Sunday, July 15, 2018

Production Grade Elasticsearch on Kubernetes

Set up a fast, custom production grade Elasticsearch cluster.

Installing production-ready Elasticsearch 6.2 on Kubernetes requires a handful of simple configurations. The following guide is a high-level overview of an installation process using Elastic’s recommendations for best practices. The GitHub project kubernetes-elasticsearch-cluster is used for the Elastic Docker container and built to operate Elasticsearch with nodes dedicated as Master, Data, and Client/Ingest.

Posted by Craig Johnston Saturday, July 14, 2018

Kubernetes Team Access - RBAC for developers and QA

Role Based Access Control

RBAC (Role Based Access Control) allows our Kubernetes clusters to provide the development team better visibility and access into the development, staging and production environments than it has ever had in the past. Developers using the command line tool kubectl can explore the network topology of running microservices, tail live server logs, proxy local ports directly to services or even execute shells into running pods.

Posted by Craig Johnston Tuesday, July 10, 2018

Python Data Essentials - Matplotlib and Seaborn

A beginner’s guide.

There is an overwhelming number of options for developers needing to provide data visualization. The most popular library for data visualization in Python is Matplotlib, and built directly on top of Matplotlib is Seaborn. The Seaborn library is “tightly integrated with the PyData stack, including support for numpy and pandas data structures and statistical routines from scipy and statsmodels.”

Posted by Craig Johnston Sunday, July 8, 2018

Webpage to PDF Microservice

Automate PDF Report Generation

I create a lot of data visualizations for clients, many of which are internal, portal-style websites that present data in real time, as well as give options for viewing reports from previous time-frames. PDFs are useful for data such as bank statements or any form of time-snapshot progress reporting. It is common for clients to want PDF versions generated on a regular basis for sharing through email or other technologies.

Posted by Craig Johnston Sunday, July 1, 2018

A Microservices Workflow with Golang and Gitlab CI

Continuous Integration & Deployment

Many of the resources on Cloud Native Microservices show you how easy it is to get up and running with AWS or GKE. I think this is great, except that I see a trend (in my clients at least) of associating concepts with particular products or, worse, companies. (I love Amazon, but it’s not THE cloud.) In my opinion, to embrace Cloud Native and Microservices you should develop some, and host them yourself. The cloud is not Google or Amazon; it’s any cluster of virtualized systems, abstracted from their hardware interfaces and centrally managed.

Posted by Craig Johnston Friday, June 22, 2018

Python Data Essentials - Pandas

A data type equivalent to super-charged spreadsheets.

Pandas brings Python a data type equivalent to super-charged spreadsheets. Pandas adds two highly expressive data structures to Python: Series and DataFrame. Pandas Series and DataFrames provide performant analysis and manipulation of “relational” or “labeled” data, similar to relational database tables like those in MySQL or the rows and columns of Excel. Pandas is great for working with time series data as well as arbitrary matrix data and unlabeled data.

Posted by Craig Johnston Sunday, June 17, 2018

Python Data Essentials - Numpy

Powerful N-dimensional array objects.

Python is one of The Most Popular Languages for Data Science, and because of this adoption by the data science community, we have libraries like NumPy, Pandas and Matplotlib. NumPy at its core provides powerful N-dimensional array objects on which we can perform linear algebra; Pandas gives us data structures and data analysis tools, similar to working with a specialized database or powerful spreadsheets; and finally Matplotlib generates plots, histograms, power spectra, bar charts, error charts and scatterplots, to name a few.

Posted by Craig Johnston Saturday, June 16, 2018

Reverse Proxy in Golang

Retrofit security proxy to prevent XSS and code injection.

Reverse proxies are standard components in many web architectures, from Nginx in front of php-fpm serving Drupal or WordPress, to endless mixtures of load balancers, security appliances, and popular firewall applications. Reverse proxies differ from forward proxies in little but their intended implementation, be it service-side or client-side. The following information is useful in either context. However, I focus on a service-side architecture. Further down this article, I’ll be going over the reasonably simple Go code needed to develop a basic, yet production quality proxy, but first I’ll give you my take on why they solve so many problems and offer up my little workhorse, n2proxy.

Posted by Craig Johnston Friday, June 15, 2018

Golang to Jupyter

Golang with Jupyter Notebooks

Jupyter Notebooks have been a popular technology in the Python data science community for a while now, especially in academia. Jupyter Notebooks are a way to mix inline, executable code with documentation in a presentation format. Best practices in organizing source code are not always the most efficient at communicating its functionality to a user.

Posted by Craig Johnston Sunday, June 10, 2018

Essential Python 3

Programming in Python

This article is a quick tour of basic Python 3 syntax, components and structure. I intend to balance a cheat sheet format with hello world style boilerplate. If you are already a software developer and need a quick refresh on Python then I hope you benefit from my notes below.

Posted by Craig Johnston Thursday, May 31, 2018