Dataphor SQL RAC (Relational Application Companion)


A site of hope for those looking for a true relational database system

Saturday, December 20, 2008

Extracting numbers from a string

This example is based on the question asked in the thread:

microsoft.public.sqlserver.programming
Dec 18, 2008
Pulling a number out of an nvarchar field
http://tinyurl.com/3quyap

The OP was interested in pulling out the 1st occurrence of a number in a
string. The string has numbers and letters. So if the string is 'XY40A3' 
we want the number 40 which is the 1st of two numbers in the string.

This is very easy to do in Dataphor. Just as a table of numbers comes in
handy for solving many different problems so does a list of letters. 
The idea here is to treat each letter in the string as a 'delimiter'. We
then split the string using the list of letters as the delimiters so
what results are the number(s) in the string.

We can store the list (the letters of the alphabet) in a temporary table
for easy access.

create session table AlphaTable
 { AlphaList:list(String),
   key{ }
 };
AlphaTable:=
       table
            {
             row
             { {'A','B','C','D','E','F','G','H','I','J','K','L','M','N',
                'O','P','Q','R','S','T','U','V','W','X','Y','Z'} AlphaList}
             };      

For example if we split a string, transform it to a table  and remove any
blanks we'll just have numbers left. If we order by sequence (Index) it will
show the numbers as they occur from left to right. 

select ToTable('XY40A3'.Split(AlphaTable[].AlphaList),'StrNum','Index' )
                where StrNum>' ' 
                  order by {Index};                  

StrNum Index 
------ ----- 
40     2     
3      3     
           
Not only would it be easy to get the 1st number but we can get the
occurrence of any number easily. The 1st occurrence is just a special
case of the general problem of getting the Nth occurrence in a string.

By using ToTable(ToList(cursor by Index (that follows the order of
numbers from left to right in the string) we can create a consecutive rank
from 1 to N over the table of numbers that will allow direct access to the 
Nth number (if it exists).
       
select
 ToTable
        (  
         ToList
               (
                cursor
                      (
                        ( 
                         ToTable('XY40A3RAD853'.Split(AlphaTable[].AlphaList),'StrNum','Index' )
                         where StrNum>' ' 
                         {ToInteger(StrNum) Num,Index}
                        )
                         order by {Index}
                      )   
               )   
        )
         {Num,Index,sequence+1 NthIndex} ;
         
Num Index NthIndex 
--- ----- -------- 
3   3     2        
40  2     1        
853 6     3               
        
Here is the an operator for the Nth occurrence that takes into account
lower case letters and returns a -1 if the Nth occurrence doesn't exist.
        
create session operator NthNum(AStr:String,N:Integer):Integer
begin
var T1:= 
  ToTable
        (  
         ToList
               (
                cursor
                      (
                        ( 
                         ToTable(Upper(AStr).Split(AlphaTable[].AlphaList),'StrNum','Index' )
                         where StrNum>' ' 
                         {ToInteger(StrNum) Num,Index}
                        )
                         order by {Index}
                      )   
               )   
        )
         {Num,Index,sequence+1 NthIndex};
result:=IfNil((T1 adorn{key{NthIndex}})[N].Num,-1);
end;

select NthNum('SF346fs47sGs759 ',1); //returns 346
select NthNum('SF346fs47sGs759 ',2); //returns 37
select NthNum('SF346fs47sGs759 ',3); //returns 759  
select NthNum('SF346fs47sGs759 ',4); //returns -1

Here a table of strings in stored in an Sql Server database from which
we can extract the 1st occurrence of a number.

create table FooStrings
{
 keycol:Integer,
 datacol:String nil,
 key{keycol}
};
FooStrings:=
 table
  {
 row{1 keycol, 'XYZ40AB' datacol},
 row{2, 'WX32A'},
 row{3, '27 blah'},
 row{4, 'A87BNC30'},
 row{5, 'XY40A3'},
 row{6, 'TWFD'},
 row{7, 'XYA53GH5JGV934'},
 row{8, '7'},
 row{9, nil}
  };
  
select FooStrings add{NthNum(IfNil(datacol,' '),1) MyNumber}
                   with {IgnoreUnsupported = 'true'}
                    order by {keycol};
                      
keycol datacol        MyNumber 
------ -------------- -------- 
1      XYZ40AB        40       
2      WX32A          32       
3      27 blah        27       
4      A87BNC30       87       
5      XY40A3         40       
6      TWFD           -1       
7      XYA53GH5JGV934 53       
8      7              7        
9      <No Value>     -1   


Think how easy the operator can be modified if it's desired to return
the minimun or maximum number etc.

If you would like to see a coherent and easy to follow all t-sql solution
that doesn't cobble together string functions into spaghetti code see:
RAC - Are you Coordinated?

But I think you'll agree the much preferred solution is in Dataphor 

Saturday, December 06, 2008

Sql server dynamic crosstabs by Jeff Moden

This is the RAC version of Jeff Modens fine article on dynamic crosstabs
for sql server at:

Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs 2008/12/03 
www.sqlservercentral.com/articles/cross+tab/65048/

The data is generated by the method outlined in the article. The table was 
populated with 1 millions rows

- Basic crosstab. 2 secs in QA for S2005.
Exec Rac
@transform='Sum(SomeMoney) as SumMony',
@rows='Someletters2',
@pvtcol='(left(datename(mm,DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)),3)+~ ~+
          cast(year(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)) as char(4))) as mthyr',
@pvtsort='month(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0))', -- Sort pivot expression by an integer. 
@from='##JBMTest',
@WHERE='SomeDate>=~Jan  1 2008 12:00AM~ AND SomeDate<~Jul  1 2008 12:00AM~',
@rowtotalsposition='end',@racheck='y',@shell='n'

Someletters2 Funct   Jan 2008   Feb 2008   Mar 2008   Apr 2008   May 2008   Jun 2008   Totals
------------ ------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
AA           SumMony 685.67     763.64     656.13     575.93     879.13     192.13     3752.63
AB           SumMony 927.06     928.98     280.20     632.43     560.99     785.50     4115.16
AC           SumMony 791.09     555.18     916.71     273.23     187.48     508.31     3232.00
AD           SumMony 250.04     341.58     426.53     645.56     670.13     422.86     2756.70
AE           SumMony 809.14     487.21     295.33     625.92     716.12     527.19     3460.91
.
.
ZY           SumMony 776.32     682.98     326.17     677.69     546.87     926.54     3936.57
ZZ           SumMony 532.75     500.92     277.93     636.40     607.05     553.43     3108.48
Totals       SumMony 433997.25  383211.70  411913.12  411878.29  425431.07  409809.47  2476240.90

Here some additional bells and whistles are thrown in:)
 
-- Executed in 26 secs in QA for S2005.
Exec Rac
-- The same transformed repeated twice for different purposes. 
@transform='Sum(SomeMoney) as SumMony & Sum(SomeMoney) as [% row]',
@rows='Someletters2',
@pvtcol='(left(datename(mm,DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)),3)+~ ~+
          cast(year(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)) as char(4))) as mthyr',
@pvtsort='month(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0))',
@from='##JBMTest',
@WHERE='SomeDate>=~Jan  1 2008 12:00AM~ AND SomeDate<~Jul  1 2008 12:00AM~',
@rowtotalsposition='end',@racheck='y',@pformat='_pvtcols_',@shell='n',@translabel='Summary',
-- Display min and max sum for each Someletters along with pivot (date) it occurred.
-- The min and max are displayed in separate rows. Default is same row for all rowfunctions.
@rowfunctions='min(SumMony) & max(SumMony)',@rowfunctionslabel='Min/Max',@displayrowfunctions='m',
-- Running sum of pivot columns for each row from left to right. The pivot sum is followed by the 
-- run in each cell.
@colruns='SumMony',
-- The percentage of the pivot sum/[row total] displayed in a separate row.
@cpercents='[% row] %only' -- a different transform alias to force a separate row.
                           -- We could display in same row as sum (and column runs).


Someletters2 Summary Min/Max                 Jan 2008            Feb 2008            Mar 2008             Apr 2008             May 2008             Jun 2008             Totals
------------ ------- ----------------------- ------------------- ------------------- -------------------- -------------------- -------------------- -------------------- ----------
AA           SumMony min(192.13,Jun 2008)    685.67/685.67       763.64/1449.31      656.13/2105.44       575.93/2681.37       879.13/3560.50       192.13/3752.63       3752.63
                     max(879.13,May 2008)                                                                                                                                 
             % row                           18.3%               20.3%               17.5%                15.3%                23.4%                5.1%                 -
AB           SumMony min(280.20,Mar 2008)    927.06/927.06       928.98/1856.04      280.20/2136.24       632.43/2768.67       560.99/3329.66       785.50/4115.16       4115.16
                     max(928.98,Feb 2008)                                                                                                                                 
             % row                           22.5%               22.6%               6.8%                 15.4%                13.6%                19.1%                -
AC           SumMony min(187.48,May 2008)    791.09/791.09       555.18/1346.27      916.71/2262.98       273.23/2536.21       187.48/2723.69       508.31/3232.00       3232.00
                     max(916.71,Mar 2008)                                                                                                                                 
             % row                           24.5%               17.2%               28.4%                8.5%                 5.8%                 15.7%                -
AD           SumMony min(250.04,Jan 2008)    250.04/250.04       341.58/591.62       426.53/1018.15       645.56/1663.71       670.13/2333.84       422.86/2756.70       2756.70
                     max(670.13,May 2008)                                                                                                                                 
             % row                           9.1%                12.4%               15.5%                23.4%                24.3%                15.3%                -
AE           SumMony min(295.33,Mar 2008)    809.14/809.14       487.21/1296.35      295.33/1591.68       625.92/2217.60       716.12/2933.72       527.19/3460.91       3460.91
                     max(809.14,Jan 2008)                                                                                                                                 
             % row                           23.4%               14.1%               8.5%                 18.1%                20.7%                15.2%                -
AF           SumMony min(406.49,May 2008)    788.30/788.30       415.40/1203.70      605.56/1809.26       613.81/2423.07       406.49/2829.56       520.40/3349.96       3349.96
                     max(788.30,Jan 2008)                                                                                                                                 
.
.
ZY           SumMony min(326.17,Mar 2008)    776.32/776.32       682.98/1459.30      326.17/1785.47       677.69/2463.16       546.87/3010.03       926.54/3936.57       3936.57
                     max(926.54,Jun 2008)                                                                                                                                 
             % row                           19.7%               17.3%               8.3%                 17.2%                13.9%                23.5%                -
ZZ           SumMony min(277.93,Mar 2008)    532.75/532.75       500.92/1033.67      277.93/1311.60       636.40/1948.00       607.05/2555.05       553.43/3108.48       3108.48
                     max(636.40,Apr 2008)                                                                                                                                 
             % row                           17.1%               16.1%               8.9%                 20.5%                19.5%                17.8%                -
Totals       SumMony min(383211.70,Feb 2008) 433997.25/433997.25 383211.70/817208.95 411913.12/1229122.07 411878.29/1641000.36 425431.07/2066431.43 409809.47/2476240.90 2476240.90
                     max(433997.25,Jan 2008)                                                                                                                              
             % row                           17.5%               15.5%               16.6%                16.6%                17.2%                16.5%                -
 

About Me

My photo
Phoenix, Arizona, United States