Friday, February 24, 2012

Handling Data Integrity Issues in SQL2000

Hello,
I’m looking for some help with analysis services.
I ahve a very simple fact table which has 2 columns
Account Number and Investment Objective.
Fact table
Account NumberInvestment Objective
12345678A
22222222A
33333333B
44444444X
A dimension is needed for the investment objective
So I have a lookup table which is
ValueDescription
AGrowth
BNo-Growth
There is not an X value in the lookup table so when I look at the count for
accounts I only get 3. The account 44444444 never shows up.
Is there a way to have 44444444 or any other account that might get a value
not in the lookup table to fall into an ‘Unknown’ type description?
I understand the best way to solve is to make sure I have a value in the
lookup table for every value that is in the investment objective fact, the
problem is I cant control what might get added to it, and we want to be able
to have an unknown description and have everything that falls out of the
range of the lookup go into that. This will allow the users of the cube to
find the bad entries and fix them.
Of course this is sample data and the real tables have millions of records
and 100’s of columns, but I think the basic concept applies.
Any help or a direction to go in would be greatly appreciated. BTW this is
SQL2000
Thanks
Hello AppDev,
I am assuming you are using a Star schema in a SQL Server database
somewhere. The Star Schema is used as a basis for the OLAP cube you
are creating.
Remember Analysis Services requires all dimension records to be unique
and fact records to have a relationship with all dimensions. This means
Fact records that do not have a correlating dimension record will not
appear in your cube. (This only happens when you included dimensions
that do not have a relationship with all fact records)
There is a quick solution to this. Create an 'Unknown' record in
your dimension table. When populating the fact table you can return the
unknown surrogate key value to join to the unknown dimension record.
This will make unknown records appear in your cube.
Hope this helps
Myles Matheson
Data Warehouse Architect
|||(microsoft.public.sqlserver.olap is a better newsgroup for a posting like
this, but here goes)
My first gut feel is that you should not be allowing this to occur. This is
basic RI between a fact table and the dimension. You should be processing
your dimension to pickup additions prior to processing the fact table. This
will ensure that this doesn't occur if RI is in-place on the RDBMS.
The row is disappearing from the fact table because the default SQL
statement is an inner join between the fact table and the dimension table.
Thus the row will not be returned to Analysis Services at all . . . we
simply don't see it. The RDBMS eliminates it before we get it.
In SQL2K, you best option is to create an UNKNOWN member in the base
dimension and then load your fact data through a view. In the view use a
CASE clause with an EXISTS and replace the FK being returned based on
whether or not that key exists. If it doesn't exist, then return the UNKNOWN
member.
In SQL2K5, the system supports an unknown member directly and you can load
data w/ an error configuration which tells it to assign invalid FKs with the
system generated unknown member directly.
Dave Wickert [MSFT]
dwickert@.online.microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
This posting is provided "AS IS" with no warranties, and confers no rights.
"appdevtech" <appdevtech@.online.nospam> wrote in message
news:44BBA514-66F1-4FDD-A0D2-11E6BB6D2F13@.microsoft.com...
> Hello,
> I'm looking for some help with analysis services.
> I ahve a very simple fact table which has 2 columns
> Account Number and Investment Objective.
> Fact table
> Account Number Investment Objective
> 12345678 A
> 22222222 A
> 33333333 B
> 44444444 X
> A dimension is needed for the investment objective
> So I have a lookup table which is
> Value Description
> A Growth
> B No-Growth
> There is not an X value in the lookup table so when I look at the count
> for
> accounts I only get 3. The account 44444444 never shows up.
> Is there a way to have 44444444 or any other account that might get a
> value
> not in the lookup table to fall into an 'Unknown' type description?
> I understand the best way to solve is to make sure I have a value in the
> lookup table for every value that is in the investment objective fact, the
> problem is I cant control what might get added to it, and we want to be
> able
> to have an unknown description and have everything that falls out of the
> range of the lookup go into that. This will allow the users of the cube
> to
> find the bad entries and fix them.
> Of course this is sample data and the real tables have millions of records
> and 100's of columns, but I think the basic concept applies.
> Any help or a direction to go in would be greatly appreciated. BTW this
> is
> SQL2000
> Thanks
>
|||Thanks Dave,
My gut feeling is to fix the RI issues first also, but in this case I can
use the cube to identify the issues, versus catching the data during import
then writing a report to notify the users of the inconsistent data.
When you say "create an UNKNOWN member in the base dimension” do you mean
adding an additional record to my dimension table?
Value Description
A Growth
B No-Growth
X Unknown
Then in the view if it does not exist, return X(Unknown)?
"Dave Wickert [MSFT]" wrote:

> (microsoft.public.sqlserver.olap is a better newsgroup for a posting like
> this, but here goes)
> My first gut feel is that you should not be allowing this to occur. This is
> basic RI between a fact table and the dimension. You should be processing
> your dimension to pickup additions prior to processing the fact table. This
> will ensure that this doesn't occur if RI is in-place on the RDBMS.
> The row is disappearing from the fact table because the default SQL
> statement is an inner join between the fact table and the dimension table.
> Thus the row will not be returned to Analysis Services at all . . . we
> simply don't see it. The RDBMS eliminates it before we get it.
> In SQL2K, you best option is to create an UNKNOWN member in the base
> dimension and then load your fact data through a view. In the view use a
> CASE clause with an EXISTS and replace the FK being returned based on
> whether or not that key exists. If it doesn't exist, then return the UNKNOWN
> member.
> In SQL2K5, the system supports an unknown member directly and you can load
> data w/ an error configuration which tells it to assign invalid FKs with the
> system generated unknown member directly.
> --
> Dave Wickert [MSFT]
> dwickert@.online.microsoft.com
> Program Manager
> BI SystemsTeam
> SQL BI Product Unit (Analysis Services)
> --
> This posting is provided "AS IS" with no warranties, and confers no rights.
>
> "appdevtech" <appdevtech@.online.nospam> wrote in message
> news:44BBA514-66F1-4FDD-A0D2-11E6BB6D2F13@.microsoft.com...
>
>
|||Hello Myles,
For example, you could add a new record in the dimension table with
"value"="unknown". You could create a view fact1 and use this as the fact
table when you create a cube
create view fact1 as
select f.AccountNumber,
case when exists (select d.value from dim1 d where d.value=f.objectvalue )
then f.[Investment Objective]
else 'unknown'
end
as InvestmentObjective,
f.number
from fact f
Regards,
Peter Yang
MCSE2000/2003, MCSA, MCDBA
Microsoft Online Partner Support
When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.
================================================== ===
This posting is provided "AS IS" with no warranties, and confers no rights.
| Thread-Topic: Handling Data Integrity Issues in SQL2000
| thread-index: AcVzZJNkPr6fSXC8RMqPvlgseXtMiA==
| X-WBNR-Posting-Host: 12.155.246.10
| From: "=?Utf-8?B?YXBwZGV2dGVjaA==?=" <appdevtech@.online.nospam>
| References: <44BBA514-66F1-4FDD-A0D2-11E6BB6D2F13@.microsoft.com>
<#BtbY5rcFHA.1324@.tk2msftngp13.phx.gbl>
| Subject: Re: Handling Data Integrity Issues in SQL2000
| Date: Fri, 17 Jun 2005 10:47:05 -0700
| Lines: 98
| Message-ID: <6CC7E76E-3E1C-49BE-B976-71D0D694B5AB@.microsoft.com>
| MIME-Version: 1.0
| Content-Type: text/plain;
| charset="Utf-8"
| Content-Transfer-Encoding: 8bit
| X-Newsreader: Microsoft CDO for Windows 2000
| Content-Class: urn:content-classes:message
| Importance: normal
| Priority: normal
| X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
| Newsgroups: microsoft.public.sqlserver.datawarehouse
| NNTP-Posting-Host: TK2MSFTNGXA03.phx.gbl 10.40.2.250
| Path: TK2MSFTNGXA01.phx.gbl!TK2MSFTNGXA03.phx.gbl
| Xref: TK2MSFTNGXA01.phx.gbl microsoft.public.sqlserver.datawarehouse:1828
| X-Tomcat-NG: microsoft.public.sqlserver.datawarehouse
|
| Thanks Dave,
| My gut feeling is to fix the RI issues first also, but in this case I can
| use the cube to identify the issues, versus catching the data during
import
| then writing a report to notify the users of the inconsistent data.
|
| When you say "create an UNKNOWN member in the base dimension” do you
mean
| adding an additional record to my dimension table?
|
| Value Description
| A Growth
| B No-Growth
| X Unknown
|
| Then in the view if it does not exist, return X(Unknown)?
|
| "Dave Wickert [MSFT]" wrote:
|
| > (microsoft.public.sqlserver.olap is a better newsgroup for a posting
like
| > this, but here goes)
| >
| > My first gut feel is that you should not be allowing this to occur.
This is
| > basic RI between a fact table and the dimension. You should be
processing
| > your dimension to pickup additions prior to processing the fact table.
This
| > will ensure that this doesn't occur if RI is in-place on the RDBMS.
| >
| > The row is disappearing from the fact table because the default SQL
| > statement is an inner join between the fact table and the dimension
table.
| > Thus the row will not be returned to Analysis Services at all . . . we
| > simply don't see it. The RDBMS eliminates it before we get it.
| >
| > In SQL2K, you best option is to create an UNKNOWN member in the base
| > dimension and then load your fact data through a view. In the view use
a
| > CASE clause with an EXISTS and replace the FK being returned based on
| > whether or not that key exists. If it doesn't exist, then return the
UNKNOWN
| > member.
| >
| > In SQL2K5, the system supports an unknown member directly and you can
load
| > data w/ an error configuration which tells it to assign invalid FKs
with the
| > system generated unknown member directly.
| > --
| > Dave Wickert [MSFT]
| > dwickert@.online.microsoft.com
| > Program Manager
| > BI SystemsTeam
| > SQL BI Product Unit (Analysis Services)
| > --
| > This posting is provided "AS IS" with no warranties, and confers no
rights.
| >
| >
| > "appdevtech" <appdevtech@.online.nospam> wrote in message
| > news:44BBA514-66F1-4FDD-A0D2-11E6BB6D2F13@.microsoft.com...
| > > Hello,
| > > I'm looking for some help with analysis services.
| > >
| > > I ahve a very simple fact table which has 2 columns
| > > Account Number and Investment Objective.
| > >
| > > Fact table
| > > Account Number Investment Objective
| > > 12345678 A
| > > 22222222 A
| > > 33333333 B
| > > 44444444 X
| > >
| > > A dimension is needed for the investment objective
| > > So I have a lookup table which is
| > >
| > > Value Description
| > > A Growth
| > > B No-Growth
| > >
| > > There is not an X value in the lookup table so when I look at the
count
| > > for
| > > accounts I only get 3. The account 44444444 never shows up.
| > > Is there a way to have 44444444 or any other account that might get a
| > > value
| > > not in the lookup table to fall into an 'Unknown' type description?
| > >
| > > I understand the best way to solve is to make sure I have a value in
the
| > > lookup table for every value that is in the investment objective
fact, the
| > > problem is I cant control what might get added to it, and we want to
be
| > > able
| > > to have an unknown description and have everything that falls out of
the
| > > range of the lookup go into that. This will allow the users of the
cube
| > > to
| > > find the bad entries and fix them.
| > >
| > > Of course this is sample data and the real tables have millions of
records
| > > and 100's of columns, but I think the basic concept applies.
| > >
| > > Any help or a direction to go in would be greatly appreciated. BTW
this
| > > is
| > > SQL2000
| > > Thanks
| > >
| >
| >
| >
|

No comments:

Post a Comment