Skip to content

The problem with personal finance blogs

I love me some personal finance.

I know, some think it's boring, but I love thinking about personal finance, the stock market, and all the ins and outs of the personal investing. I like researching stock picks, thinking about the market, and what I can do to further my own finances. In fact, I've toyed with the idea of getting a MS in finance -- not because it would further my career, but strictly because I like the topic.

During my nightly web surfing, I've stumbled across a number of personal finance blogs, and it seems that all these bloggers follow the same path:

  1. I'm in Debt!!!11
  2. I follow a personal finance guru:
    1. Suze Orman
    2. Dave Ramsey
    3. Robert Kiyosaki
    4. All of the above, and/or others
  3. I'm out of debt!
  4. Hmm, you know I would like to make a living blogging...
  5. Hey, I'm actually making a couple hundred a month blogging, can I quit my job?
  6. ...No...
  7. I know, I'll start advertising spammy things so I can quit my job!

And it degenerates from there with the blogger talking about amazon "friday sales", credit card reward points programs, Quiken coupons, and free credit reports. No longer are they talking about financially sound decisions and helping their readership. It's very disappointing. People who are in a position to actually offer advice and encouragement chose to whore out their readers eyeballs for a quick buck.

In some sense, I guess I'm realizing that personal finance blogs don't offer much advice beyond that of the national gurus, and that I really don't need that advice. Despite the fact that I love the turnaround story of people pulling themselves out of debt, I guess I'll need to clear my blogroll again, and find some new blogs to read in the evenings.

Reject the Treasury’s proposed bailout

I know the following will be ignored. I've got to try though...

Senator:

I am writing to tell you my feelings regarding the Treasury's recent proposal to buy up to 700 billion dollars of assets from America's financial institutions. I feel this legislation would be a disaster for the American people, and I will outline my reasons below.

I understand that America's financial institutions are on shaky ground. Unlike most Americans, I have followed the growing crisis in the financial sector for over two years. I have seen this storm brewing. This problem has been on the balance sheets of America's firms since the housing bubble burst in 2006, and obvious to anyone who follows the market.

This problem is not new. It is not unseen. It has been coming for years, and known to the very financial institutions who now seek a handout. I am no economist. I am no ex-chief of Goldman Sachs. But, I can add, and I knew it was coming.

This proposal will cost 2500 dollars for every man, woman and child in the United States. That money could have been spent on alternative energy research, or health insurance for the uninsured, or the wars, or any number of other federal programs. Instead we are rushing to bail out banks who have seen this coming for years.

We are taking no time to reflect on the problems' causes. We are taking no time to prevent similar bailouts in the future. We are rushing to give the Treasury carte blanche to spend without any oversight. We are letting the foxes watch the henhouse, and we're giving them a nice chicken dinner before they begin their shift.

I urge you, Senator, do not spend that money without some thought. Reject this legislation until we have time to develop the correct solution.

I have faith that you have the strength to stand up and represent the people of this great state of Delaware.

Regards,
Jeremy Faller

Sent to Senator's Biden and Carper.

Noogler

Since I only post when I feel like I have something to say, the fact that it's been over 4 months since my last set of posts might make people think I haven't had a lot to say recently. That would be incorrect. I have had a lot to say, but little time to say it.

After much hemming and hawing, my wife and I decided to take a job offer from Google, and move to the Boston area. I start on Monday, Sept 15.

It wasn't an easy decision, but I think it's the right one.

First of all, I am excited as anything to go work for Google. They are an excellent company, a great place to work, and I think I'll have something I haven't had in a long time at a job -- challenge. But it was an extremely difficult decision for a number of reasons. I won't get into the reasons here. My close friends and family already understand.

In the meantime, I am not sure what direction this blog will move. I have been semi tinkering on some OSS projects since the Google option came up, and I'll have to see if that continues when I get to Google. Additionally, I'm not talking much about what I'll be doing when I get to Google, but I think it will be a departure from what I've done in the past.

Wish me luck.

When Ants Come to Your Picnic

Since I posted my "Sling Blade Runner" answer, I've been in discussion with Benjamin Meyer about the puzzle, my solution, and other general topics. During this discussion, he shared his solution with me, and as a collaboration, we found a set of movie titles that is 326 movies long. The original list of 318 movies was generated by my program, but was expanded to 326 by his. (My algorithm is good at finding long lists, his algorithm is good at optimizing long lists.)

I've decided that since I've shared much of the algorithm with him (not my code), that I would publish my answer here as well.

The algorithm I used is called Ant Colony Optimization (ACO). ACO is a randomized algorithm where virtual ants walk a graph of the problem space developing answers. Once a number of ants have walked the problem, they lay a "pheromone" on the "paths" that were particularly good. The algorithm then has more virtual ants walk the problem space, but this time they "slightly favor" paths that have pheromone on them. Once the second round of ants finishes, all pheromones on the problem are faded, and the best ants from the second run lay their pheromone. A simple state machine describes ACO's operation:

  1. Allow N ants to randomly walk your problem. The probability that an ant will take a certain turn in the path is a combination of an apriori probability and a weighting probability from the pheromones laid on the particular edge.
  2. Fade all pheromones on the edges.
  3. For the M best ants, allow them to lay the pheromones.
  4. Goto (1) until you think you've got a good path.

ACO is devilishly simple in its implementation, develops surprisingly good answers to this problem, and is highly threadable. But, ACO does come with its own set of drawbacks -- in particular there are a number of "magically tuned" parameters that affect its operation:

  1. What is a good number of M and N?
  2. What is appropriate amount of pheromone for "good" ants to lay?
  3. What is a good way to combine the apriori and pheromone probabilities?
  4. What is a good apriori probability?
  5. And finally, How much do you fade all the different pheromones between steps?

Unfortunately, ACO doesn't give us much help with many of the above problems. That didn't phase me, I just charged ahead and threw some ants at my problem. :)

In my source directory, you will find a copy of my source code that implements Ant Colony Optimization of Sling Blade Runner. There are a number of different comments I'd like to make:

  1. Unlike Ben's clever solution, I didn't bother to thread it. It would be easy enough to do, I didn't bother.
  2. There are a number of optimizations in this code. Namely, I stripped out ACO's call to the pow function, and made them lookup tables, and rejiggered things to minimize copying things to and from the stack.
  3. There are a number of optimizations I didn't make. Namely, I call new (through push_back/push_front) in my main loop. That's just plain silly, but I didn't bother to strip it out.
  4. My magical parameters were tuned by trial and error. Most of them are #defines, and if you read the ACO literature, you can understand my nomenclature.
  5. The code isn't the most readable. I coded it quickly, and have tweaked it. It's only a personal project, so don't hate me because it ain't beautiful. :)

Having implemented ACO, I can say that it's an interesting idea, and one I'm happy to have in my arsenal. It codes up quickly, and I have already used it to solve a couple of other problems of varying complexity.

Edited to add:
You'll need the movies data to run my program.

Sling Blade Runner

Lately, I've been without a project to keep me busy at nights (hence all my blog posts). I'm learning python, but since my main gig is C/C++, I like to keep those skills sharp with little puzzles. While looking for interesting coding puzzles, I stumbled across a page at ITA Software that asks you to find the longest chain of movies you can. For example, "Sling Blade Runner" are two titles strung together. Seemed simple enough at first blush, but once you study the problem some, you realize it's much harder than originally thought.

So, I sat down and started hacking. Since the puzzle is a recruiting tool, I won't spoil it by giving my code, but I came up with a solution that seems to be the best one published on the web at this time.

My code averages around 270 per run, and runs take about 5 minutes. My solution is heuristic, and is guaranteed to find the longest answer, but you might have to wait till the sun swallows your computer before it finds it.

Just to throw down the gauntlet, I submit my longest movie string found (during a standard 5 minute run) -- 304 movies.

Editted to add:
It's surprising how often this movie ends up in a list. (Props to Kevin James Barker!)

Re-editted to add:
Well, It looks like I'm not King of the Hill any longer (see the comments). I guess I should get back to work on this puzzle.

Beautiful Code

Recently, my wife and I went to the local Borders, and I saw O'Reilly's Beautiful Code. Since the book has sat in my Amazon queue for months, I was excited to get a chance to peruse it. So, I gleefully sat down, cracked the cover, and proceeded to read a number of chapters.

I'll save you some time and just tell you that the book was supremely disappointing. With the exception of the first chapter, I felt the book was entirely "missable". In fact, I would venture to say that the only piece of truly "Beautiful Code" I found in the whole book would be Brian Kernighan's (with help from Rob Pike) regular expression engine. To me, it is an inspired piece of code. It shows a simple regular expression engine in 30 lines of C code. It shows the beauty and power of C with a practical and elegant solution to a real world problem. Wow.

Now, I admit, not everyone is Brian Kernighan or Rob Pike, but I felt let down by Beautiful Code. I felt most of the submissions were just pieces of code that the authors yanked from their projects and decided to talk about. None of them were as generic, widely applicable, or beautiful as that by Kernighan.

I might end up buying Beautiful Code anyway since it's for a good cause. I doubt I'd ever re-read it though.

Safe Mutex Management

Recently, I've been writing a fair amount of threaded C++ POSIX code, and (as I often do), I decided to wrap the POSIX mutexes in a convenient little wrapper class:

 
class Mutex
{
    pthread_mutex_t mMutex;
public:
    Mutex() { pthread_mutex_init(&mMutex, NULL); }
    ~Mutex() { pthread_mutex_destroy(&mMutex); }
 
    void Lock() { pthread_mutex_lock(&mMutex); }
    void Unlock() { pthread_mutex_unlock(&mMutex); }
    bool TryLock() { return (pthread_mutex_trylock(&mMutex) == 0)?(true):(false); }
};
 

I know others' opinions differ, but I personally like wrappers like this. They allow me a nice way to make sure that all my Mutexes are friendly C++ objects, play well as member variables, etc.

While writing some code, I caught myself writing a number of functions like the following:

 
void MyClass::MyFunc()
{
    mMutex.Lock();
    if (SomeCondition())
    {
        if(SomeOtherCondition())
        {
            // etc...
        }
    }
    mMutex.Unlock();
}
 

Now, whenever I write code like the above function, it makes my skin crawl. First of all, I hate writing nested if statements. Secondly, and more importantly, the above code is exception unsafe in that if any of the code before the mMutex.Unlock() throws an exception, the mutex won't be unlocked correctly.

First I thought that since I don't care about the exception in MyFunc(), I could get away with:

 
void MyClass::MyFunc()
{
    mMutex.Lock();
    try
    {
        // etc...
        mMutex.Unlock();
    }
    catch(...)
    {
        mMutex.Unlock();
        throw;
    }
}
 

But, I hate the multiple calls to unlock the mutex, the unneeded try/catch block, and just everything about this code.

After some thinking, I came to a conclusion, that in hindsight must be obvious to most C++ programmers out there:

 
class SafeLock
{
    Mutex &mMutex;
public:
    SafeLock(Mutex &inMutex) : mMutex(inMutex) { mMutex.Lock(); }
    ~SafeLock() { mMutex.Unlock(); }
};
 

Now all I need to do is the following:

 
void MyClass::MyFunc()
{
    SafeLock lock(mMutex);
    if(!SomeCondition()) return;
    if(!SomeOtherCondition()) return;
    // etc...
}
 

Additionally, I have found that SafeLock makes it easier on me to call functions that require a mutex to be held before calling it. I just pass references to the SafeLock to all functions that require the mutex held:

 
void MyFuncThatRequiresMutex(const SafeLock &lock)
{
    (void)lock;
    // etc...
}
 

Torch Singer

These things crawl across my floor, I can't use them anymore.
There's a heaven in her band, Hallelujahs in my hand.
All my patience, love's inside -- she just climbed the stairs and died.
Lights that rose, and fell again, songs that thinned out near the end.

Her voice trailed off in the end.

Though your miles are more than mine -- the things I've taken in a bind.
It's for certain, it's for sure... I've no use for them no more.
Making room within one's self, for another's song to help
And it all comes back to me, as I walk hungover down the street.

And it all comes back to me.

And it all comes back to me, as I walk hungover down the street.
She's a mother in disguise, I look different in her skies.
But, it's morning, so I say, "It's a big red letter day."
Her skin's whitewash like skim milk, her words sing softly just like silk.

There's some things I've got to say, she won't understand anyway.
There are miles between our hearts, there's salvation in false starts.

I'm forsaken, in the end.


I'm sorry I missed them.

Escaping the MTRX

Embedded developers have it harder than most. Often we don't have library options most other developers do. For example, the library might depend on file I/O, or it might require easy access to a large heap, or it even might require floating point arithmetic that our part is no good at implementing in software.

Overtime we learn to adapt and overcome. We develop a set of libraries which performs most of the functionality we need. We find good articles in different embedded magazines that have code we can rip off. And we even sometimes suck it up, and buy better parts or operating systems to make development easier.

Over the recent months, I've been re-tweaking a simple set of C matrix functions I've used before. The library is specifically designed for embedded developers. What I mean by that is I tried to follow a number of goals:

  • The library should be optimized for readability/portability.
  • Don't try to be all things to all people, provide the basic functionality people will require for one and two dimensional matrices.
  • Don't require a heap.

The following is my MTRX library:
mtrx.h
mtrx.c

The library provides the basic matrix functionality like:

  • Scalar addition, subtraction, and multiplication
  • Multiplication
  • Inversion
  • In-place and out of place transposition

I haven't extended it to include eigenvalues and vectors. Additionally, I haven't added different matrix decompositions that others might need. Again, the library is a barebones matrix implementation that most embedded developers can use with a little tweaking.

Here's an example of using the library:

 
#include "mtrx.h"
#include <stdlib.h>
#include <time.h>
 
#define N (100)
 
int main()
{
	unsigned i,j;
	srand(time(0));
	for(i = 0; i < 1000; ++i)
	{
		MTRX_t			m1, m2, m3;
		MTRX_Element_t	f[N * N];
		for(j = 0; j < N * N; ++j)
			f[j] = ((float)(rand())) / ((float)RAND_MAX);
		MTRX_Create(&m1, N, N, f); // will use the stack's f[] for storage
		MTRX_Create(&m2, N, N, 0); // uses the heap
		MTRX_Create(&m3, N, N, 0); // uses the heap
		MTRX_Copy(&m2, &m1);
		if (MTRX_Invert(&m2, 0)) // will use heap for temp storage
			MTRX_Multiply(&m3, &m1, &m2);
		MTRX_Destroy(&m1);
		MTRX_Destroy(&m2);
		MTRX_Destroy(&m3);
	}
	return 0;
}
 

The library is pretty smart about using the allocations. In other words, if you create matrices, and give them enough size, they should never be allocated on the heap; however, if built right, the library will yank space from the heap when needed. To turn off heap allocation, build with MTRX_NO_MALLOC_.

Additionally, the return values from most functions are bool's. This means two things, most functions can fail (generally if the matrices aren't big enough), and you need C9X to build it. Also, a few minor notes:

  • Don't use the inplace transpose function if you can help it. It's quite expensive.
  • The inversion routine requires a temporary buffer. Read the header file for size requirements. (If no buffer is passed, and the library is compiled with heap support, one is allocated.)

As usual this library is MIT licensed, but I'd appreciate it if people would drop me a line if they use it.

A line in the sand

The other day, someone (who shall remain nameless to protect their identity) asked me the following:

I have a set of data points, (X,Y), and I know they are related through a simple function:

Linear Equation

How do I determine m, and b?

Now, there are many ways to determine values for m and b; but, right now I'd like to discuss Least Sqaures fitting.

The original idea for least squares fitting came from Carl Gauss, and it stems from the idea that we'd like to minimize a measurement of the error in our result. In other words, we'd like to find m and b such that error is minimized, or as close to 0 as we can get:

Error Equation

Now, it turns out, minimizing the above function doesn't work really well. In particular, we could pick a and b such that the error is Error Equation, and that wouldn't leave us with a good value for m or b. We, therefore, choose to minimize the square of the error function:

Error Equation

Now, as we all remember from high school calculus (you took calculus in high school didn't you), we can minimize the error function, by taking the first derivative and set it equal to zero. Since, we're trying to minimize with respect to m and b, we'll have to differentiate with respect to both variables:

Partial b

Partial m

Now, if we rejigger the above formulas, we can find:

Simplified Partial b

Simplified Partial m

At this point, we could solve for m and b and be done with the whole thing, but there is a simplification we can make. If we first define the following:

x variance

y variance

yx covariance

Then through the power of algebra, we can simplify the above equations and solve for m and b. I'll leave the exercise to the reader, but the solution then ends up:

At this point, you might be saying, "That's all well and good Jeremy, but show me the code." So, without further ado, here's the code that can perform least squares linear regression in C:

 
////////////////////////////////////////
// Given vectors x and y, finds m and b via least squares
//  y = m*x + b
void
LinearRegression(unsigned N, float *x, float *y, float *outM, float *outB)
{
    unsigned i;
    float    xSum, ySum, ss_xx, ss_xy;
    xSum = ySum = ss_xx = ss_xy = 0;
    for(i = 0; i < N; ++i)
    {
        xSum += x[i];
        ySum += y[i];
        ss_xx += x[i] * x[i];
        ss_xy += x[i] * y[i];
    }
    ss_xx -= xSum * xSum / N;
    ss_xy -= xSum * ySum / N;
    *outM = ss_xy / ss_xx;
    *outB = (ySum - *outM * xSum) / N;
}
 

You'll notice that I juggled around the calculations of averages of x and y. Depending on data set size and precision, this might introduce some bugs. You might want to use the less optimal, but more correct:

 
////////////////////////////////////////
// Given vectors x and y, finds m and b via least squares
//  y = m*x + b
void
LinearRegression(unsigned N, float *x, float *y, float *outM, float *outB)
{
    unsigned i;
    float    xMean, yMean, ss_xx, ss_xy;
    xMean = yMean = ss_xx = ss_xy = 0;
    for(i = 0; i < N; ++i)
    {
        xMean += x[i] / N;
        yMean += y[i] / N;
        ss_xx += x[i] * x[i];
        ss_xy += x[i] * y[i];
    }
    ss_xx -= N * xMean * xMean;
    ss_xy -= N * xMean * yMean;
    *outM = ss_xy / ss_xx;
    *outB = yMean - *outM * xMean;
}
 



I'm just noticing that it's been forever since I posted. I apologize for that. I'll try to get neat stuff to say more often.