Showing posts with label tips. Show all posts
Showing posts with label tips. Show all posts

Tuesday, 30 October 2012

Store and Report Unlimited Multiple Choices using Exponentiation

non-SQL developers look away now....

Much is in the air at the moment about digital tsunami, the Big Data movement and Business Intelligence technologies to name but a few. This post offers a concept that may reduce a tiny bit of the space along the way by showing how to store multiple choice answers in a single location.

One concept of storing option choices from forms, surveys and online exams for example is to store them combined as a single value. This removes the headache, sorry overhead, of dealing with dozens of boolean yes/no values that appear in a raw dataset and allows the Business Intelligence expert a more manageable view of what's on her plate. 


The method takes advantage of exponentiation where the values against option boxes are used as the exponent or base against an index value of 2.  
I said any number of choices but this is actually limited to a database server's own limitations or the  limits imposed by cloud based systems as part of their own governance rules.

Scenario:

Let's say we have a form on a web page that has 5 option boxes and a person completes the form and chooses the first, third and fifth choices. 

Each option box should have assigned values 1 to 5 respectively and based on the choices made (1,3 and 5) we derive the single value to store as follows : 

21 (option box 1) which is 2
+
23 (option box 3) which is 8
+
25 (option box 5) which is 32

So the single number to store that represents the choices 1,3 and 5 becomes 2 + 8 + 32 which is 42

We of course need to unravel the number 42 to display the choices back onto the page or app (p.s...no hint here to the Meaning of Life!). To do this we start by taking the highest valued choice box and test if it's associated 25 value can be subtracted from 42 and if it can then we mark it's equivalent check box. The remainder is then used against the next lower value option box and repeated until we reach the lowest value box. 

The steps below outline the full approach for this example.

1. Is 42 - 25 (25) greater or equal to 0?  
    42 - 25 = 10.
    Yes it is so we subtract it and use the remainder in the next step.

    We also mark the fifth choice option box as checked. 
  2. Is 10 - 24 (16) greater or equal to 0?
    10 - 16 = -6
    No it's not so we leave it and don't check option box 4. Move to the next step.

3. Is 10 - 23 (8) greater or equal to 0?
    10 - 8 = 2
    Yes it is so we subtract it and use the remainder in the next step.

    We also mark the third choice option box as checked. 

4. Is 2 - 22 (4) greater or equal to 0?
    2 - 4 = -2
    No it's not so we leave it and don't check option box 2.

5. Is 2 - 21 (2) greater or equal to 0?
    2 - 2 = 0
    Yes it is.

    We also mark the first choice option box as checked. 

This will now display option box 1,3 and 5 as checked.

Lastly, the single stored value will need reported on and for this you need to create a view which will be made up of a series of case statements one for each yes/no choice based on the logic above. 
For those using SQL Server then a Common Table Expression (CTE) would provide a more efficient alternative.

I have offered a concept here without any implementation code. This is because the problem can be solved using SQL, JavaScript, PHP, Ruby, C# or many other alternatives depending on the platform in use. 

I hope this helps :)

Enjoy !

Wednesday, 24 October 2012

Generating Unique Values Against a New Dataset

non-SQL developers look away now....

Here's another Data Analysis tip...

When dealing with brand new datasets a DBA Developer sometimes needs to add a unique value which will form a way of identifying each row. The ROW_NUMBER() function is an obvious choice but if the data contains duplicates or has data quality issues it can be tricky. This method doesn't rely on the data itself, it just adds the unique values you need.

First we add a column that will contain the incremental number......


-- add a new column called rownum
alter table test_data add rownum int

Now we run the following three line script......

-- declare a variable that be used to assign incremental numbers
declare @i int
--assign a starting point for our id's

select @i = 10000000
-- The update statement will populate the column 
-- with consecutive numbers starting with 10000001
update test_data set @i = rownum = @i +1

I used 10000000 as a starting point so that each number will be the same length and will appear consistent when used in a reporting scenario and when exported back out to tools such as Microsoft Excel. 

One to have in your SQL tool kit for sure.....Enjoy!