************************************************************
DSS News
D. J. Power, Editor
September 29, 2002 -- Vol. 3, No. 20
A Bi-Weekly Publication of DSSResources.COM
************************************************************
Check the Applebee's case featuring Teradata
************************************************************
Featured:
* DSS Wisdom
* Ask Dan! - What is the potential size of the NSEERS database?
* What's New at DSSResources.COM
* DSS News Releases
************************************************************
DSS News is sent to more than 875 subscribers from 50
countries. Please forward DSS News to people interested
in Decision Support Systems and suggest they subscribe.
************************************************************
DSS Wisdom
Newell and Simon (1972) argued human problem solving can be understood
"by describing the task environment in which it takes place; the space
the problem solver uses to represent the environment, the task, and the
knowledge about it that he gradually accumulates; and the program the
problem solver assembles for approaching the task (p. 868)."
from Newell, A. and H. A. Simon, Human Problem Solving, Englewood
Cliffs, NJ: Prentice-Hall, Inc., 1972.
************************************************************
Enhance model-driven DSS with Crystal Ball simulation software.
Download a FREE evaluation at www.crystalball.com/dss
************************************************************
Ask Dan!
by Daniel J. Power
What is the potential size of the NSEERS database?
The U.S. Immigration and Naturalization Service (INS) is creating a
number of very large databases to support a variety of operations and
processes. In addition to the NSEERS transaction database discussed two
weeks ago (DSS News, Vol. 3, No. 19), INS is developing the Student and
Exchange Visitor Information System (SEVIS). SEVIS is an Internet-based
system that will be accessed at U.S. Ports of Entry and by more than
1900 schools and colleges. Biometric border crossing cards will also be
required of Mexican border crossers as of October 1, 2002. These
projects are massive in scope and it is not clear how data will be
shared between systems. This Ask Dan! continues the discussion.
How do we "size" these databases? Is a data warehouse needed? What
platform and software are needed? How will this data collection effort
support decision making at INS? Some of these questions were addressed
in my prior column, but Marc Demarest, former Chairman and CEO of
DecisionPoint Applications and current President of Noumenal
(http://www.noumenal.com/marc/), offered his analysis and insights and I
accepted. Marc began with my assumptions for the NSEERS transaction
processing system -- 35 million visitors per year, 45 KB for a
fingerprint, 10 KB for a photo and 5 KB for alphanumeric string data per
visitor.
Marc writes "I think we need to double or triple your alphanumeric
string count. For US citizens and green card holders, the 5K is probably
right, but have you seen the forms foreign nationals have to fill out?
They're huge, and, after all, this is John Ashcroft we're talking about,
so, by the time we add in the foreign national's 'home information' and
all of the data on where the foreign national will be traveling in the
US and what they will be doing and who they will be seeing, I'd bet the
foreign national alpha information tops 20K easily. So let's say 12K as
an average for alpha data."
Second "I didn't see a discussion of what would be captured on exit, but
something will surely be captured, yes? That would be the easiest way to
catch visa overstays. And if Ashcroft is worried about people
masquerading as other people, he'll capture about as much on the way out
as he did on the way in. Let's say a photo, a fingerprint, and 5K on the
way out."
"Now, how would such an application work? The 'decision support' is
going to be largely automated, I'd imagine. I'd assume this application
is going to (a) compare photographs, (b) compare fingerprints, (c)
analyze the alpha data provided and then (d) weight the outcomes and (e)
make a recommendation to the official at point-of-capture based on that
data. That is the system model (at least I hope it is). Now, they may
also do other things with the data, including uploading it into other
(INS, NSA, DCA) systems for other kinds of analysis, but this is a
closed loop capture-and-analyze system, I'd bet, not two systems: (1) a
transaction processing app and (2) a DSS app that is loaded from the TP
app according to a schedule."
"Since the data in the database is analyzed programmatically and not by
a person, it doesn't have to be inherently legible at the schema level,
so we're probably not talking a 'star' style schema -- we're talking
some kind of normal form schema. Raw data loaded into any schema creates
a loaded set size larger than the raw data -- because of DBMS storage
mechanisms, indexing overhead, etc. They'll have to be using a
conventional RDBMS because this is an INSERT-AND-QUERY system: Teradata-
and nCube-style DBMSs and OLAP engines won't take the INSERTs fast
enough or elegantly enough. For a star implemented in a conventional
RDBMS, one usually sees a 2.5X growth in the raw load set size once it's
loaded, and I think it'd be close to the same growth factor in this
case: maybe 2.8X. We're also not talking about needing to maintain a lot
of history in this system -- the system-of-record for pictures and
fingerprints will be elsewhere, because that system will be used for
multiple purposes, and data will be migrated out of this system into the
real 'data warehouse' for the Department of Homeland Security as soon as
an individual alien's entry-exit loop is closed, so I'd bet there will
never be more than the equivalent of 1 year's worth of data in the
system. In other words, this system won't be extracted INTO; it will be
extracted FROM."
As an aside Marc noted "The most difficult technical bit for the system
will be indexing strategy: the more indexes they add to cut query time,
the longer insert time will take. The fewer the indexes, the longer the
complex set of queries they are going to have to run will take."
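The insert-versus-query tradeoff Marc describes can be observed on a small
scale. The sketch below is only an illustration, not the INS design: it uses
SQLite as a stand-in RDBMS, and the table name (crossing), its columns, and
the row counts are all hypothetical. It loads the same rows into a table with
no indexes and into one with four indexes, then reports the load times. The
indexed load is typically slower, but the exact ratio depends on the machine
and the DBMS.

```python
import sqlite3
import time


def time_inserts(n_rows, indexed_columns):
    """Load n_rows into an in-memory table and return (seconds, row_count).

    The schema is hypothetical and exists only to illustrate index overhead.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crossing ("
        "visitor_id INTEGER, port TEXT, country TEXT, crossed_at TEXT)"
    )
    # Every index created here must be updated on each INSERT.
    for col in indexed_columns:
        conn.execute(f"CREATE INDEX idx_{col} ON crossing ({col})")

    # Synthetic rows; the values carry no meaning beyond giving the
    # indexes something to sort.
    rows = [
        (i * 7919 % n_rows, f"port{i % 300}", f"c{i % 190}",
         f"2002-09-{i % 28 + 1:02d}")
        for i in range(n_rows)
    ]

    start = time.perf_counter()
    with conn:  # commit once, after all rows are inserted
        conn.executemany("INSERT INTO crossing VALUES (?, ?, ?, ?)", rows)
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM crossing").fetchone()[0]
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    no_idx, _ = time_inserts(20_000, [])
    four_idx, _ = time_inserts(
        20_000, ["visitor_id", "port", "country", "crossed_at"]
    )
    print(f"no indexes:   {no_idx:.4f} s")
    print(f"four indexes: {four_idx:.4f} s")
```

The reverse experiment, timing a selective query with and without an index,
shows the other side of the tradeoff.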
Based on his assumptions and analysis, Marc calculated NSEERS is "about
a 12 TB system ... easily within the range of Oracle running on a nice
cluster of Sun or IBM SMP/NUMA boxes." Marc concludes "You're right,
however, that in the final analysis it will be hard to implement."
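The column does not reproduce Marc's arithmetic, so the sketch below is a
reconstruction under stated assumptions: one year of data, 35 million
visitors, an entry record of 45 KB + 10 KB + 12 KB, an exit record of a
fingerprint, a photo and 5 KB at the same sizes, decimal units (1 KB = 1,000
bytes, 1 TB = 10^12 bytes), and the 2.8X load growth factor. Under those
assumptions the loaded size comes out near 12 TB, which is consistent with
the figure quoted above.

```python
# Sizing sketch for the NSEERS estimate. All figures come from the column;
# the choice of decimal units is an assumption made here.
KB = 1_000
TB = 1_000_000_000_000

VISITORS_PER_YEAR = 35_000_000
GROWTH_FACTOR = 2.8  # Marc's estimate of loaded size vs. raw size


def raw_tb(kb_per_visitor, visitors=VISITORS_PER_YEAR):
    """Raw data volume in TB for one year of visitors."""
    return visitors * kb_per_visitor * KB / TB


# Original assumptions: fingerprint + photo + alphanumeric data, entry only.
baseline_kb = 45 + 10 + 5

# Revised assumptions: larger alpha data on entry, plus an exit record.
entry_kb = 45 + 10 + 12
exit_kb = 45 + 10 + 5
revised_kb = entry_kb + exit_kb

baseline_raw = raw_tb(baseline_kb)
revised_raw = raw_tb(revised_kb)
revised_loaded = revised_raw * GROWTH_FACTOR

print(f"baseline raw:   {baseline_raw:.2f} TB")
print(f"revised raw:    {revised_raw:.2f} TB")
print(f"revised loaded: {revised_loaded:.2f} TB")
```

Changing any single assumption, such as the growth factor or the size of the
exit record, shifts the total noticeably, which is one reason to treat the
12 TB figure as an order-of-magnitude estimate.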
Thanks Marc for letting me quote so extensively from your analysis. A
12 Terabyte database is huge and the more I reflect on the INS projects
the more I can see the databases expanding in size. Perhaps these two
Ask Dan! columns will stimulate more thinking and discussion about the
important decision support issues associated with monitoring visitors to
the United States. These new INS systems are mission critical and the
projects provide us the opportunity to think innovatively about
providing decision support from very large databases. As always your
comments and questions are welcomed. If you want a challenge, reflect on
how you would support decision making at INS. If you teach DSS or
database, try asking your students the questions raised in DSS News,
Vol. 3, Nos. 19 and 20.
References
Demarest, Marc, Email message, Monday, September 16, 2002 at 09:38:31.
Power, D. J., "Is it feasible to track all visitors to the United States
and then build a Data-driven DSS?" DSS News, Vol. 3, No. 19, September
15, 2002.
************************************************************
Visit DSS News Sponsors - Crystalball.com and Teradata.com
************************************************************
What's New at DSSResources.COM
09/22/2002 Added materials to Power, D. J. "A Brief History of Decision
Support Systems", saved as version 2.1, September 2002, URL
DSSResources.COM/history/dsshistory.html.
09/19/2002 Posted case by Teradata Staff, "Understanding customers'
preferences at Applebee's International", Teradata, a division of NCR
Corporation, 2002, URL DSSResources.COM/cases/.
************************************************************
Get information about Dan Power's book, Decision Support
Systems: Concepts and Resources for Managers, at
http://www.dssresources.com/dssbookstore/power02.html .
************************************************************
DSS News Releases - September 15 to September 27, 2002
Complete news releases can be found at DSSResources.COM.
09/27/2002 Call for Papers: ICEIS 2003 - 5th International Conference on
Enterprise Information Systems, Angers, France 23-26 April, 2003. Paper
deadline October 15, 2002.
09/26/2002 Microsoft unveils the Center for Information Work.
09/26/2002 Q&A: What is the Microsoft Center for Information Work?
09/26/2002 Stellent unveils vision for the future of content management.
09/26/2002 Harrah's selects TIBCO for business integration platform.
09/26/2002 Application outsourcing the most efficient and cost effective
method of software implementation, IDC system dynamic models prove.
09/25/2002 New Network Computing study finds third party remote access
providers reduce management burdens, save money.
09/24/2002 Schwan's selects Intermec handheld computers and mobile
printers for nationwide route sales.
09/24/2002 The emergence of the Internet "Mainframe" -- the WebFrame --
will drive infrastructure growth says NetsEdge Research Group.
09/24/2002 Ford selects SGI Reality Center technology for visualization
and design optimization.
09/24/2002 Sun Microsystems is honored with the Helen Keller Achievement
Award for its leadership in accessibility advancements.
09/23/2002 Leading analyst research finds Business Objects number one
business intelligence tools vendor in Western Europe.
09/23/2002 Jones & Stokes uses eRoom hosted enterprise service to manage
environmental planning projects.
09/23/2002 Sybase introduces first comprehensive healthcare integration
suite built on open standards.
09/23/2002 Teradata profitability analytics bolsters the bottom line for
mobile communications companies.
09/23/2002 Netezza unleashes tera-scale data appliance for Business
Intelligence.
09/23/2002 Teradata signs worldwide reseller agreement with Informatica;
will empower customers with integrated decision-making by linking
operational and analytic capabilities.
09/23/2002 GeoSpatial World 2003 enhances exhibitor opportunities to
reach GIS, IT, and mapping decision makers.
09/23/2002 ProClarity Corp. fastest growing Business Intelligence vendor
named to Software Magazine's 20th Annual Software 500.
09/23/2002 China's largest coal company deploys Datastream enterprise
asset management solution.
09/23/2002 Nortel Networks introduces SSL-based secure extranets for
enterprise customers; enables secure connectivity for remote users
equipped with web browsers.
09/23/2002 Microsoft delivers new migration and coexistence tools for
Lotus Notes applications.
09/23/2002 Wrigley selects SAP as global business systems platform.
09/23/2002 International Biometric Group releases Biometric Market
Report 2003-2007.
09/23/2002 SAS(R) solution adapters for SAP reduce time to Business
Intelligence.
09/20/2002 Call for Participation: 2003 Information Resources Management
Association (IRMA) International Conference. Submission Deadline:
October 4, 2002.
09/20/2002 Extreme Networks demos production 10 gigabit ethernet
infrastructure for next generation network services at Sun Conference.
09/20/2002 Optiant customer Imation wins Start Magazine's Technology and
Business Award.
09/19/2002 CIOs report on Information Technology's hottest jobs in
semi-annual Robert Half survey.
09/19/2002 Forrester Research launches ninth TechRankings category:
Business Process Management.
09/18/2002 Advanced weather model running on SGI systems used to predict
dispersion of hazardous aerosols and gases.
09/17/2002 SAS Enterprise Miner to support PMML; SAS and IBM simplify
the management and deployment of data mining.
09/17/2002 Applix announces first Applix Integra customer win,
additional wins for Applix iTM1, Applix iEnterprise.
09/17/2002 Decisioneering attacks corporate market with new services
leader, Dr. Johnathan Mun.
09/17/2002 To create `paperless' office, Doctors started from scratch:
opened new office with software provided via Internet.
09/17/2002 New MindManager add-in enables teams to build visual project
plans - then export to leading project management tools.
09/17/2002 Cognos prescribes business intelligence solution for Markham
Stouffville Hospital.
09/17/2002 Oracle(R) Java and web services tools leadership confirmed by
developer community and industry press.
09/16/2002 Information Builders announces a no-fee SEVIS compliance
analysis for higher education institutions.
09/16/2002 AMR Research reports SAS leads business
intelligence/analytics market.
09/16/2002 Nobilis Software announces Nobilis Ci ProcessWriter for the
desktop.
************************************************************
DSS News is copyrighted (c) 2002 by D. J. Power. Please send your
questions to daniel.power@dssresources.com. You have previously
subscribed to the DSS News Mailing List.