Why Did The Arizona Cardinals Miss The 2013-2014 NFL Playoffs?

Because there are now 32 teams in the NFL, up from 28 in 1990 when the 16 game season started, while there are still only 12 playoff spots. 10 – 6 is not good enough anymore to almost guarantee a playoff spot.

Read the full analysis here.

facebooktwittergoogle_plusredditpinterestlinkedinmail

Related books picked – and if possible read – by me. Sponsored by Amazon Associates.

How To Turn A String Into An Integer In Hive SQL For Hadoop

You use the CAST function.


CAST(fieldname AS INT)

Let’s say you have a field called ‘subscription_ids’ in the ‘subscriber’ table that someone saved as a string but actually should be integers.

To change this to an integer, you would use the command:

SELECT CAST(subscription_ids AS INT) as subscription_ids_as_int

FROM Subscriber

Related books picked – and if possible read – by me. Sponsored by Amazon Associates.

Hilarious Meme About Being A Data Analyst

I found this What People Think I Do / What I Really Do meme about Data Analysts while doing a Google Image Search for Data Analysis. I promise that you’ll get a kick out of seeing it. The last three pictures are perfection.

Related books picked – and if possible read – by me. Sponsored by Amazon Associates.

How To Turn Your Excel Worksheet Into A Simple HTML Table

As I said in my NFL Playoff post, here is a brief description on how to convert an Excel worksheet into a simple HTML table.

What makes this tricky is that Excel tries to preserve all of the formatting from Excel in the HTML file, which adds an enormous amount of clutter that you do not need.

Here is what to do:

The Really Short Version

Save the worksheet as a “Single File Web Page”. Open it up in a text editor and erase everything above the <table> tag and below the </table> tag. Erase all of the attributes in the remaining tags.

The Longer And More Detailed Version

  1. Go to File, Click Save As.
  2. In the “Save as Type” drop box, choose the format “Single File Web Page (*.mht,*mhtml)”.
  3. Click Save.
  4. Click “Yes” in the pop up box to lose formating.
  5. Find the file where you saved it.
  6. Open it in a text editor like Notepad, though I highly recommend getting the free Notepad++
  7. Man there is a lot of code. Don’t worry. Just Search for the string “<table”. This is where the actual data from the worksheet is. It should contain a lot of <tr> and <td> tags. You should be able to spot your actual data inside the <td> and </td> tags.
  8. Erase all of the code before “<table”.
  9. Search for “</table”.
  10. Erase all of the code after “</table … > full tag.
  11. Erase all of the attributes inside all of the remaining tags such as “width=” and “style=”. A good short cut at this point is to a find and replace for what you are trying to erase and replace it with nothing since it is repeated in many of the tags.

That should do it.

Related books picked – and if possible read – by me. Sponsored by Amazon Associates.

Does A Record Of 10 – 6 Guarantee A Playoff Spot In The NFL?

One thing that I have noticed in recent years while following the National Football League is that teams with 10 wins and 6 losses miss the playoffs more than in the past. Am I just seeing things, or has have things slowly changed since 16 game seasons started in 1990? If I am not seeing things, what caused it? Parity? More teams? I decided to take a look. Here is what I found:

Year Total NFL Teams Teams 10-6 In Playoffs Not In Playoffs
1990 28 2 2 0
1991 28 3 2 1
1992 28 2 2 0
1993 28 3 3 0
1994 28 3 3 0
1995 30 3 3 0
1996 30 4 4 0
1997 30 3 * 3 0
1998 30 3 3 0
1999 30 2 2 0
2000 31 4 4 0
2001 31 3 3 0
2002 32 4 * 4 0
2003 32 6 5 1
2004 32 3 3 0
2005 32 3 2 1
2006 32 3 3 0
2007 32 5 4 1
2008 32 1 1 0
2009 32 3 3 0
2010 32 5 4 1
2011 32 3 3 0
2012 32 5 4 1
2013 32 2 1 1

* In 1997, The New York Giants were 10 – 5 – 1. In 2002, Pittsburgh was 10 – 5 – 1. I considered that close enough for 10-6 for these purposes.

As one can see, before 2003, only the 1991 season had a team with a 10 – 6 record that missed the playoffs. Then in 2003, 2005, 2007, 2010, and 2012, one 10 – 6 team missed the playoffs. This does appear to be a recent trend.

So what caused it? Parity or More Teams?

As far as parity is concerned, I would argue that parity started in the 1997 season, three years after the salary cap took effect. The AFC finally won a Superbowl that season after losing for 13 years straight. Superbowls that came after tended to flip more between the two conferences each year while far fewer were blowouts with one team crushing the other.

I suspect the reason for the recent increase in seasons with at least one 10 – 6 team missing the playoffs is the increase in the number of teams. In 1990, the NFL had 28 teams. By 2002 it had increased to 32. However, the number of playoff spots stayed fixed at 12.

Think of the playoffs as a pie that has exactly 12 slices. Some of those slices are bigger and tastier than others. A bye week with home field advantage throughout the playoffs is the biggest and tastiest slice of pie. The smallest and blandest is a wild card with no home field advantage. However, even as the league added more teams, the slices of playoff pie have stayed fixed at 12. 28 teams shared those 12 slices in 1990. Since 2002, 32 teams have fought for the same 12 slices of pie. Therefore, one now has to struggle more to get a slice of playoff pie, causing a 10 – 6 record to be less likely than it used too to get you a playoff spot.

Up Next

How to make a nice HTML table like the one above in Excel..

Related books picked – and if possible read – by me. Sponsored by Amazon Associates.

Five Great Blog Posts on Dealing with Hive for Hadoop SQL: Date and Time, Numeric, String, Conditional, and Collection Functions

I currently work using SQL for Hive for Hadoop. Hive SQL is a little eccentric. If you have worked in other forms of SQL, it takes some getting used to.

Below are five blog posts from folkstalk.com which have helped me immensely.

  1. Date and Time Functions
  2. Numeric functions
  3. String functions
  4. Conditional functions
  5. Collection functions

Why are these posts any good?

  1. Date and Time Functions – Dates and Times in Hadoop are stored as strings in the format ‘YYYY-MM-DD HH:MM:SS.T’ (T is tenth of a second), not a date/time datatype or an integer. It gives you a good sense of how to deal with this.
  2. Numeric functions – It’s pretty much the same as other forms of SQL. Given how nerve rattling Hive SQL can be at first, it’s reassuring to know that some things are still the same.
  3. String functions – Check out the CONCAT function. It’s an actual function in this case and not a symbol like ‘&’.
  4. Conditional functions – Check out the COALESCE function. This one gets used quite a bit when one is trying to clean data by combining multiple fields. All it does is return the first non NULL value from the fields listed. It’s surprisingly useful.
  5. Collection functions – Check out the CAST function. This is how you change a value from one type to another. For example, to change the field ‘beers_drunk’ from a string to an integer would be CAST(beers_drunk AS int)

HELP! Excel Dropped the Leading Zero from My New England 5 Digit ZIP Codes! – The New England Zip Code Problem, Part 1

Contents

Let me guess why you’re here:

Someone just gave you an Excel workbook where one of the columns is ZIP codes and many of them are 4 instead of 5 digits. The cities and states next to ZIP codes are all from New England,
and probably New Jersey and a few other places too
. You no longer have the original text file and are stuck with the workbook.

What’s the problem:

The problem is that Excel is not very good at preserving data types. A five digit ZIP code (which despite the postal service introducing nine digit ones with a dash between the 5th and 6th digit people still generally use) looks like an integer to Excel. Excel treats it as such and eliminates any zeros in the front. However, it is not an integer, it is a code. Specifically, it is a postal code, which you want to use to store as a text string. This isn’t a problem as long as your address is west of Philadelphia, but it becomes a real headache with New England, New Jersey, and Caribbean ZIP codes.

For example, the ZIP code for the neighborhood of Back Bay in Boston should be 02116. However, Excel assumes that it is an integer and gives you 2116.

Another way of thinking about the problem:

A good way of thinking about this is to look at the Postal Codes from other countries, which is what a ZIP Code really is. Here are a couple of examples what they look like in Canada and the United Kingdom, where they use alphanumeric postal codes.

Country Example Postal Code
Canada M5A 1N1
United Kingdom W11 2BQ

You would never store these codes as integers. You would store them as strings. Excel would not mistake them for integers, even if they started with zeros.

What’s the short term solution:

This solution assumes that all of your ZIP codes are in the original 5 digits, and not the new format of 5 digits + 4 digits.

The short term solution is fairly easy. Let’s say the zip codes are in column E. Insert a column to the right of the zip code column E. Next, in the first row of the new column F, type the following formula:

=IF(LEN(E2)<5,REPT("0",5-LEN(E2))&TEXT(E2,"#"),TEXT(E2,"#"))

Replicate the formula down the column.

Wow, that formula is hard to read:

In plain english, what the formula says is:
"If the length of the value in cell E2 is less than the 5 digits in a ZIP code, then concatenate the correct number of zeros to the front of the value of E2 and convert it to a text string. Else, just convert the value in E2 to a text string."

Let’s break it up into its components.

It is all nested in an Excel IF function.

=IF(condition, do if true, do if false)

The condition clause

LEN(E2) < 5 – is the ZIP code less than 5 digits.

The Do if true clause

REPT(“0”,5-LEN(E2)) - repeat the character "0" (NOT the number 0, the text string value "0"). You want to make this a text code, not an integer. You want 5 minus the length of the less than 5
digit numbers because this is the exact number of zeros missing.

&TEXT(E2,”#”) – convert from an integer to a string
and concatenate the 4 or less character ZIP code to the string of zeros you just generated

The Do if false clause

TEXT(E2,"#")) – you found a 5 character ZIP code, convert it to a string from digits.

That's it.

Up Next:
Let’s say you had the original text file. You could prevent this
That will be discussed in The New England ZIP Code Problem: Part II

Walkenbach & Mr. Excel: The Two Books to Get Stared in Excel VBA

If you want to get begin using Excel VBA, the best way to do so is to read these two books:

They will give you the foundation that you need to use Excel in powerful and just as importantly in reliable ways. If you choose to read them, make sure that you type out all of the examples in these books yourself in the Excel VBA IDE (what you see when type Alt-F11).

Excel VBA Programming For Dummies by John Walkenbach

This is the book with which to get started, especially if you do not know the first thing about programming. Walkenbach does an excellent job of holding your hand through the basic concepts that you need to know. As well, he sells a power utility pak that uses VBA to give Excel powerful and useful features. For extra money, you can see the code for it. I strongly recommend purchasing the pack and paying to see the code. You will see lots of examples of how to write practical applications for Excel in VBA.

VBA and Macros for Microsoft Excel Bill Jelen and Tracy Syrstad

WARNING: They have written similar books written since then, but previewing them on Amazon makes me think that they are missing key content. You need to buy the one written in 2004.

This book will take you to the next level. It is aimed at people who are more intermediate than beginner. It will show you just how powerful VBA in Excel can be. This is the book that allowed me to start getting my head around the VBA Object Model and writing VBA Classes.

While you should read the whole book, you should pay very close attention to the following chapters:

  • R1C1 Style Formulas
  • Names
  • Event Programming
  • User Forms
  • Automating Word
  • Event Programming
  • Creating Classes Records and Collections – this one is the most important
  • Handling Errors

I feel that these books are essential for getting started. You can buy them from the links below or at the top of the post:


How to Configure Contact Form 7 on WordPress So that You Can Get Messages without Getting Spammed

Having a contact form for your blog is essential for you to get feedback from your visitors without getting spammed. One plugin that does this is Contact Form 7.

Setting up Contact Form 7 was a bit trickier than I anticipated. I found this video on YouTube that did a great job of helping.

One thing I learned is always remember to scroll down while working in WordPress. There may be commands and settings that you are missing.

Please note, the author of the video has provided it in order to advertise his product. I have not used his product and have no opinion on his product.

Raw HTML Plugin – What to Use When You Want to Write HTML and Shortcode Examples in Your WordPress Post

In my last post, I used a lot of code examples in HTML, WordPress’ Shortcode, and the WordPress Video Plugin’s codes. If you have ever tried this before, you quickly realize that WordPress will immediately process the code as actual code that you want to implement, making the text that are your examples disappear, and actually implementing them. This is extremely frustrating.

For example, I just wanted to post the general format for the code to embed a youtube video useing the WordPress Video Plugin:

&#91;youtube id&#93;

Instead, I got this:

As you can see, WordPress immediately implemented the code.

How do you stop this?

Install the Raw HTML Plugin.

After you install the Raw HTML Plugin, if you want to have the code not get implemented, enclose that code in raw tags:

[raw] your code [/raw]

Now, there are at least two catches to this.

Catch Number 1: You cannot type brackets(such as [] or <>) inside the raw tags. Instead, you need to use HTML Entities (sometimes referred to as special characters – you can find a good list here).

To actually make the code above appear on the screen and not get implemented, for:

&#91;youtube id&#93;

I actually typed:

&amp;#91;youtube id&amp;#93;

Catch Number 2: Do not use the “Visual” tab when editing your post. If you do, all of the entities that you carefully typed will disappear.

Raw HTML will not take care of everything for you, but it does make things much easier.