Soft skills

This month my professional career as a Software Engineer reaches 10 years (although I’ve been programming for longer than that). Here is a summary of what I think really matters for a good career.

Technical skills are important.
As you might have noticed, my blog is mostly technical.

But in this article I’d like to praise soft skills.

Being good with everybody, communicative, adaptable, helpful, truthful, and modest, and being a good listener, observer, and learner, are just some of the most important skills.

I would also like to quote our creed here at Automattic:

I will never stop learning. I won’t just work on things that are assigned to me. I know there’s no such thing as a status quo. I will build our business sustainably through passionate and loyal customers. I will never pass up an opportunity to help out a colleague, and I’ll remember the days before I knew everything. I am more motivated by impact than money, and I know that Open Source is one of the most powerful ideas of our generation. I will communicate as much as possible, because it’s the oxygen of a distributed company. I am in a marathon, not a sprint, and no matter how far away the goal is, the only way to get there is by putting one foot in front of another every day. Given time, there is no problem that’s insurmountable.

If you look back 10 years, you will mostly see memories of events, people, and the time spent with them, not how you optimized that loop or that DB query (not that the latter is unimportant, which is why I said “mostly”).

Find inspiration and motivation in your successes, yourself, your family, nature, and other people. Maintain inner peace, and most of the soft skills will come naturally.

Predicting values with linear regression

I had fun reading http://onlinestatbook.com/2/regression/intro.html today. It includes the formulas for calculating the linear regression of a data set.

Linear regression fits a straight line to a list of known (x, y) values, and that line is then used to predict the value of one variable from the other.

For example, if a and b are related variables, linear regression can predict the value of one given the value of the other.

Here’s an implementation in Racket:

#lang racket
(require plot)

(define (sum l) (apply + l))

(define (average l) (/ (sum l) (length l)))

(define (square x) (* x x))

; Sample variance: divides by n - 1 rather than n.
(define (variance l)
  (let ((avg (average l)))
    (/
     (sum (map (lambda (x) (square (- x avg))) l))
     (- (length l) 1))))

(define (standard-deviation l) (sqrt (variance l)))

; Pearson correlation coefficient of a list of (x y) pairs.
(define (correlation l)
  (let*
      ((X (map car l))
       (Y (map cadr l))
       (avgX (average X))
       (avgY (average Y))
       (x (map (lambda (x) (- x avgX)) X)) ; deviations from the mean
       (y (map (lambda (y) (- y avgY)) Y))
       (xy (map * x y))
       (x-squared (map square x))
       (y-squared (map square y)))
    (/ (sum xy) (sqrt (* (sum x-squared) (sum y-squared))))))

; Returns the fitted line as a function of x: y = b*x + A,
; with slope b = r * sY / sX and intercept A = avgY - b * avgX.
(define (linear-regression l)
  (let*
      ((X (map car l))
       (Y (map cadr l))
       (avgX (average X))
       (avgY (average Y))
       (sX (standard-deviation X))
       (sY (standard-deviation Y))
       (r (correlation l))
       (b (* r (/ sY sX)))
       (A (- avgY (* b avgX))))
    (lambda (x) (+ (* x b) A))))

(define (plot-points-and-linear-regression the-points)
  (plot (list
         (points the-points #:color 'red)
         (function (linear-regression the-points) 0 10 #:label "y = linear-regression(x)"))))

So, for example, if we call it with this data set:

(define the-points '(
                 ( 1.00 1.00 )
                 ( 2.00 2.00 )
                 ( 3.00 1.30 )
                 ( 4.00 3.75 )
                 ( 5.00 2.25 )))

(plot-points-and-linear-regression the-points)

This is the graph that we get:
(Figure: the data points in red, with the fitted regression line drawn over them.)

Cool, right?
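
The plot shows the line but not its coefficients. As a rough cross-check, here is a minimal JavaScript sketch of the same formulas (slope b = r * sY / sX, intercept A = avgY - b * avgX) applied to the same data set. The function names (mean, sampleStdDev, pearson, fitLine) are my own picks for illustration and are not part of the Racket code above.

```javascript
// Sketch: least-squares line through [x, y] pairs, using the same
// formulas as the Racket version. All names here are illustrative.
function mean(values) {
  return values.reduce((acc, v) => acc + v, 0) / values.length;
}

// Sample standard deviation (divides by n - 1).
function sampleStdDev(values) {
  const avg = mean(values);
  const sumSq = values.reduce((acc, v) => acc + (v - avg) ** 2, 0);
  return Math.sqrt(sumSq / (values.length - 1));
}

// Pearson correlation coefficient of two equal-length arrays.
function pearson(xs, ys) {
  const avgX = mean(xs);
  const avgY = mean(ys);
  let sumXY = 0, sumXX = 0, sumYY = 0;
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - avgX;
    const dy = ys[i] - avgY;
    sumXY += dx * dy;
    sumXX += dx * dx;
    sumYY += dy * dy;
  }
  return sumXY / Math.sqrt(sumXX * sumYY);
}

// Fits y = slope * x + intercept to an array of [x, y] pairs.
function fitLine(points) {
  const xs = points.map(p => p[0]);
  const ys = points.map(p => p[1]);
  const slope = pearson(xs, ys) * (sampleStdDev(ys) / sampleStdDev(xs));
  const intercept = mean(ys) - slope * mean(xs);
  return { slope, intercept, predict: x => slope * x + intercept };
}

const thePoints = [[1, 1], [2, 2], [3, 1.3], [4, 3.75], [5, 2.25]];
const line = fitLine(thePoints);
console.log(line.slope.toFixed(3), line.intercept.toFixed(3)); // 0.425 0.785
```

So for this data set the fitted line is roughly y = 0.425x + 0.785, and it passes through the point of averages (3, 2.06).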

Correctness of iterative and recursive processes

Iterative processes are typically proven correct using loop invariants, and recursive processes using induction. Finding a good loop invariant can be tricky, whereas the proof for a recursive process usually just follows the definition of the process itself.


Consider the following recursive definition:

maxList [x] = x
maxList (x:xs) = max x (maxList xs)

We can prove its correctness using induction:

– Base case: The maximum element of a list of size 1 is the element itself.

– Inductive step: Assume that maxList xs is the maximum element of xs.

Then for maxList (x:xs) we have 2 cases:
1. maxList xs >= x, in which case we select maxList xs
2. x > maxList xs, in which case we select x
In either case we pick the larger of the two, which is the maximum of the whole list.
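
To make the shape of the induction easier to see, here is a small JavaScript sketch of the same recursive definition, with comments marking where the base case and the inductive step show up. The empty-list check is my addition (the definition above simply has no case for an empty list), and the spot-check at the end only complements the proof; it does not replace it.

```javascript
// Sketch of the recursive definition. Like the original, it is only
// defined for non-empty lists.
function maxList(list) {
  if (list.length === 0) {
    throw new RangeError("maxList is undefined for an empty list");
  }
  const [x, ...xs] = list;
  // Base case: a one-element list is its own maximum.
  if (xs.length === 0) return x;
  // Inductive step: maxList(xs) is assumed to be the maximum of xs,
  // so the larger of x and maxList(xs) is the maximum of the whole list.
  return Math.max(x, maxList(xs));
}

// Spot-check against the built-in on a few sample lists.
const samples = [[7], [1, 2, 3], [3, 2, 1], [2, 9, 9, 4], [-5, -1, -3]];
const allAgree = samples.every(s => maxList(s) === Math.max(...s));
console.log(allAgree); // true
```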


Now consider the following iterative definition:

var max = x[0], i;

for (i = 0; i < x.length; i++) {
     if (x[i] >= max) max = x[i];
}

In this case we need a loop invariant: a statement that holds before the loop, after every iteration, and therefore after the loop as well.

We can use the following loop invariant: once index i has been processed, max is the biggest element in the subarray x[0..i] (both ends inclusive).

– Before the loop: max is x[0], which is trivially the biggest element of the one-element subarray x[0..0]. So the loop invariant holds.

– Within the loop, we have two cases:
1. x[i] >= max, in which case we set max to x[i]
2. x[i] < max, in which case we don't change max
In either case max is now the biggest element of x[0..i], so the loop invariant holds.

– After the loop: max is the biggest element in the subarray x[0..x.length - 1], which is just x.
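
One way to get a feel for a loop invariant is to check it at run time. The sketch below wraps the loop from above in a function and, after each iteration, verifies that max is the biggest element of the prefix x[0..i]. The names maxWithInvariant and invariantHolds are made up for this example, and the check makes the function quadratic, so it is for illustration only. It assumes x is non-empty.

```javascript
// Sketch: the loop from above, with the invariant checked at run time.
// maxWithInvariant is an illustrative name; x must be non-empty.
function maxWithInvariant(x) {
  let max = x[0];
  // Invariant check: max is in x[0..upTo] and no element there exceeds it.
  const invariantHolds = upTo => {
    const prefix = x.slice(0, upTo + 1);
    return prefix.includes(max) && prefix.every(v => v <= max);
  };

  if (!invariantHolds(0)) throw new Error("invariant fails before the loop");
  for (let i = 0; i < x.length; i++) {
    if (x[i] >= max) max = x[i];
    if (!invariantHolds(i)) throw new Error("invariant fails at i = " + i);
  }
  // On exit the last checked prefix was x[0..x.length - 1], the whole array.
  return max;
}

console.log(maxWithInvariant([3, 1, 4, 1, 5, 9, 2, 6])); // 9
```

If the comparison were accidentally flipped to <=, the check would throw on the first iteration where max stops being the biggest element seen so far, which is exactly the case analysis in the proof.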