I love to create data cubes.  One of the most intense aspects of data cubes is the data collection, which I’ll leave for a future tip.  Data organizing and formatting for analysis / reporting closely follows data collection intensity for me.  So for you number jockeys, today’s tip is the formula SUMPRODUCT and why it is so valuable.

Technically, SUMPRODUCT, allows you to work with arrays and perform vector math.  You can go to Excel / Calc help and look up the formula name and get the basic limited view of how to use the formula.  I am going to give a practical example that has allowed me to simplify and automate the data organizing and analysis.

My original problem was attempting to count how many times a specific team won.  In my data set, a team can reside in both the Home and Away columns depending on where they were playing that week.  So I was trying to count values only if it was associated with a specific team in a specific column.  I tried to embed a second set of criteria within the COUNTIF() function only to get errors from those resulting formulas.

As a result of my findings, we will examine three functions; COUNTIF(), SUM(), AND SUMIF(), and how they can be replaced by SUMPRODUCT which can have as many criteria you can come up with, in other words that wonderful concept of infinity.

In my first example, I want to count how many times the Home team won in week #1.  I have two sheets titled ‘2007′ (holds the data) and ‘2007 Home Team Analysis’ (the sheet where I organize the data for analysis).  That produces the following formula =COUNTIF(’2007′.T2:T17;”H”).  Now you ask, what do I do for week #2.  Well, you would copy the formula down to the next cell and change lookup field in column T resulting in the following formula =COUNTIF(’2007′.T18:T33;”H”).  Ok, not so bad until you take into account weeks 3-10 vary from 13 to 15 games per week due to the bye’s.  Having to go in and manually update lookup fields gives me a head ache.  I don’t know about you, but I love to find ways to keep me from getting a head ache.

Here’s the solution to this problem:  On your analysis sheet (’2007 Home Team Analysis’) you have a column to identify the criteria.  In this case I have a column (week #) that appears in both sheets, so instead of having to create 17 different lookup fields I now have only 1 lookup field that contains 256 rows for the entire NFL football season.  Therefore, the new way to count home team wins is

=SUMPRODUCT((’2007′.$B$2:$B$257=$A3)*(’2007′.$T$2:$T$256=”H”))

Now for the cool part, the explanation of how this reduces the data cube to the result.  The data starts with a column of 256 values (Column B on ‘2007′) and is reduced by matching only the values based on the criteria cell (Cell A3 in ‘2007 Home Team Analysis’).    So using week 1 as an example, the array is reduced from 256 values to 16.  Now continue to pare the data using the second criteria home team wins (Column T of of ‘2007′) .  Now overlay that 16 cell template over values in Column T and the result is ten home team wins in week 1 of 2007.

Now, with not as much verbiage, here are some more example to ponder:

How many points did the home teams score in week 1:

=SUM(’2007′.R2:R17) is replaced by =SUMPRODUCT((’2007′.$B$2:$B$257=$A3)*(’2007′.$R$2:$R$257))

One step further, How many points did the home teams score if they won in week 1:

=SUMIF(’2007′.T34:T49;”H”;’2007′.R34:R49) is replaced by =SUMPRODUCT((’2007′.$B$2:$B$257=$A3)*(’2007′.$T$2:$T$257=”H”)*(’2007′.$R$2:$R$257))

In conclusion, SUMPRODUCT() allows you take some basic functions and expand them to solutions only limited by your own creativity.  As for my original problem stated above, here’s the solution:

=SUMPRODUCT((’2007′.$E$2:$E$257=$C3)*(’2007′.$T$2:$T$257=”V”))+SUMPRODUCT((’2007′.$L$2:$L$257=$C3)*(’2007′.$T$2:$T$257=”H”))

The first SUMPRODUCT () looks in the road team column (E) for team (C3) and counts how many times they won on the road noted by the “V” criteriain column (T).  The second SUMPRODUCT() looks in the home team column (L) for the same team (C3) and counts hom many times they won at home noted by the “H” criterian column (T).

Comments No Comments »

An old addage in the NFL that playing at home is worth 3pt.’s.  And, wouldn’t you know, here come’s week 1 in the 2008 season:

Home Team Record
Wk # W L T % PF PA Diff.
1 10 6 0 .625 347 294 53
21.69 18.38 3.31

There’s the 3pt home field advantage (to be exact, 3.31 pt.’s). But, as in life, there’s a big difference between winners and losers. Let’s take a look:

Home Team Record Home Team in Wins Home Team in Losses
Wk # W L T % PF PA Diff. PF PA Diff. PF PA Diff.
1 10 6 0 .625 347 294 53 259 127 132 88 167 -79
21.69 18.38 3.31 25.9 12.7 13.2 14.67 27.83 -13.17

WOW!  Home team winners won by an average of 13.2, while home team losers lost by an average of 13.17.  Only one game (Panthers v. Chargers) was settled by less than 3 pts, while half the games were double digit differences.

You say that’s only one week, well StatBandit has more data for you!

Home Team Record Home Team in Wins Home Team in Losses
Wk # W L T % PF PA Diff. PF PA Diff. PF PA Diff.
1 10 6 0 .625 347 294 53 259 127 132 88 167 -79
21.69 18.38 3.31 25.9 12.7 13.2 14.67 27.83 -13.17
2 6 9 0 .400 314 371 -57 184 135 49 130 236 -106
20.93 24.73 -3.8 30.67 22.5 8.17 14.44 26.22 -11.78
3 12 4 0 .750 430 317 113 356 202 154 74 115 -41
26.88 19.81 7.06 29.67 16.83 12.83 18.5 28.75 -10.25

and the cumulative data:

Running Totals:
Home Team Record Home Team in Wins Home Team in Losses
Wk # W L T
PF PA Diff. PF PA Diff. PF PA Diff.
2 16 15 0 .516 661 665 -4 443 262 181 218 403 -185
21.32 21.45 -0.13 27.69 16.38 11.31 14.53 26.87 -12.33
3 28 19 0 .596 1091 982 109 799 464 335 292 518 -226
23.21 20.89 2.32 28.54 16.57 11.96 15.37 27.26 -11.89

Some brief comments, week 2 was a terrible week for the home team while the home team completely rebounded in week 3.  But the trend of large margins in the games continue.

PS.  StatBandit utilized OpenOffice Calc to organize and format the data.  OpenOffice Calc is the open source version of Microsoft Excel.  Be sure to check back for discussions on the pro’s and con’s of Calc compared to Excel.

Comments No Comments »

For all Word Press administrators: if your looking for an alternative to users loging into the account dashboard and then having to find the button “visit site” to get to the home page, then here’s a solution.

You will need to edit the file wp-login.php.

Find the line: <input type=”hidden” name=”redirect_to” value=”<?php echo attribute_escape($redirect_to); ?>” />

Replace <?php echo attribute_escape($redirect_to); ?>  with your site’s url

Therefore line should read: <input type=”hidden” name=”redirect_to” value=”http://your site’s url” />

Comments No Comments »