Eigenvalues

I’ve been making an effort to make more time for my blog lately, but it’s difficult in the midst of the school year. So, to keep my posts relatively frequent, I’ll post some lighter articles in between the more detailed ones, more about ideas than the full intuition and mathematical development of an algorithm. I’ll still work toward the larger, more developed articles as well, but there’s no way I can keep them coming out weekly (neural networks is next!).

So in that spirit, I want to talk about eigenvalues and eigenvectors. They’re used a lot in machine learning, specifically in something called Principal Component Analysis, a data reduction method. I mentioned them briefly in a post I did a while back about Linear Algebra, but I left the math out. It’s time to bring it back up.

Note – If you feel you don’t have the basics to understand this article, read my intro to linear algebra article!

Intuition

If we multiply a vector by a matrix, the matrix performs some geometric transformation on it. Maybe it shifts it, rotates it by 30 degrees, or changes its dimensions. We can look at this mathematically as Ax = b, where x is the vector we started with, and b is the vector we now have (x is blue, b is red).

[Figure: the vector x (blue) and the transformed vector b = Ax (red)]

There is a special case of this procedure, for certain vectors x, where Ax yields simply a stretched or shrunk version of x. Geometrically, we can picture it like this.

[Figure: an eigenvector x, for which Ax is simply a stretched version of x]

In this case, x is called an eigenvector.

Math

Now, how can we represent this mathematically? The vector b is simply x, but scaled.

b = λx

Ax = λx

Note – x must be nonzero.

Here we call x an eigenvector of the matrix A, and λ an eigenvalue of A. Now that we have a basic understanding, we have to answer a harder question. How do we find the eigenvalues and eigenvectors of A?
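If you’d like to see the definition in action before we get to the math, here’s a quick sanity check using NumPy. The matrix here is just one I made up for illustration, not anything special:

```python
import numpy as np

# A made-up 2x2 matrix, purely for illustration
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose
# columns are the corresponding eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

# By definition, A @ x equals lambda * x for each eigenpair
for lam, x in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ x, lam * x)

print(np.sort(eigenvalues))  # this matrix's eigenvalues are 1 and 3
```

Multiplying by A moves most vectors around, but each eigenvector just gets scaled by its eigenvalue.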

Let’s start with the eigenvalues. This gets a bit math heavy.

 

Finding the eigenvalues

(1)  Ax = λx
(2)  Ax = λIx
(3)  Ax − λIx = 0
(4)  (A − λI)x = 0

We can start with the above operations to make things clearer.

Step 1 – We have this from the definition of an eigenvalue and eigenvector.

Step 2 – This isn’t too crazy; all we do is introduce an identity matrix multiplying x. This is fair because x = Ix, where I is the identity matrix.

Step 3 – We subtract the term on the right from both sides, leaving the right side equal to 0.

Step 4 – We factor out an x. Ok, get ready for a bunch of math.

Here is the most difficult part of this article. The whole argument is based on the invertible matrix theorem. We have equation 4 to work with. Since x is a non-zero vector, (A − λI)x = 0 has a non-trivial solution, which means the matrix (A − λI) is not invertible. And because (A − λI) is not invertible, its determinant must be 0! This is how we will solve for the eigenvalues. The equation below is called the characteristic equation.

det(A − λI) = 0
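In code, solving the characteristic equation amounts to finding the roots of a polynomial. For a 2×2 matrix, det(A − λI) expands to λ² − trace(A)·λ + det(A), so a rough sketch with NumPy looks like this (the matrix is made up for illustration):

```python
import numpy as np

# A made-up 2x2 matrix for illustration
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# For a 2x2 matrix, det(A - lambda*I) expands to
# lambda^2 - trace(A)*lambda + det(A)
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]

# Solving the characteristic equation means finding the roots
eigenvalues = np.roots(coeffs)

print(np.sort(eigenvalues))  # 2 and 5, matching np.linalg.eigvals(A)
```

In practice you’d just call np.linalg.eigvals(A), which handles matrices of any size, but the polynomial route mirrors the hand calculation we’re about to do.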

Let’s do a simple 2×2 example to show that the process isn’t as hard as it sounds.

Example

A = [ 0   3 ]
    [ 7  -4 ]

It’s extremely helpful to write out the matrix (A − λI).

A − λI = [ -λ      3   ]
         [  7   -4 - λ ]

And recall that the determinant of a 2×2 matrix is

det [ a  b ] = ad − bc
    [ c  d ]

So the determinant of (A − λI) is

det(A − λI) = (−λ)(−4 − λ) − (3)(7) = λ² + 4λ − 21 = (λ − 3)(λ + 7) = 0

Therefore, the eigenvalues of A are λ = {3, −7}. If you’ve gotten this far, props to you, you math loving nerd :).
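You can double-check a computation like this numerically. Here’s a quick sketch with NumPy, using a matrix whose characteristic equation also factors as (λ − 3)(λ + 7) = 0:

```python
import numpy as np

# trace = -4 and det = -21, so the characteristic equation is
# lambda^2 + 4*lambda - 21 = (lambda - 3)(lambda + 7) = 0
A = np.array([[0.0, 3.0],
              [7.0, -4.0]])

eigenvalues = np.linalg.eigvals(A)

print(np.sort(eigenvalues))  # -7 and 3
```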

I wanted to keep this light, so I think I’ll stop here. I’ll follow this up with a quick post on eigenvectors. Please leave me some feedback, especially on what parts were confusing for you! I really would like to help make some of these math concepts more well known within fields heavily influenced by computer science.

Also, some of you suggested a few ways to add code into my articles more smoothly. I want to let you know that I’m working on it. I don’t self-host my site right now (it’s through WordPress), so I don’t have access to the PHP. But I’m going to switch soon, so the code will be nicer.

Thanks for reading!!!
