In order for an information
system to be effective, it must provide users with accurate, timely, and
relevant information that is free of errors, available to decision makers when
it is needed, and useful and appropriate for the types of work and decisions
that require it. Information systems
arrange data in computer files in a hierarchy that starts with bits and bytes
and progresses to fields, records, files, and databases. The traditional approach to file processing
encourages each department or area in a company to develop its own systems
and data files, known as specialized applications. Each application requires its own data files
and its own computer program to operate.
Over time, this leads to data that is difficult to maintain and manage, resulting in data redundancy and
inconsistency, program-data dependence, processing inflexibility, poor data
security, and lack of data sharing and availability.
Database technology has
evolved to reduce many of the problems of traditional file organization. A database is defined as a collection of data
organized to serve many applications efficiently by centralizing the data and
controlling redundant data. A single
database can service multiple applications. A database management system (DBMS) is a type
of software that allows an organization to centralize data, manage them
efficiently, and provide access to the stored data by application
programs. This minimizes redundant and
inconsistent files.
The most common type of DBMS
used on PCs, larger computers, and mainframes is the relational DBMS. Relational databases organize data in
two-dimensional tables called relations, and each table consists of rows and
columns. As long as two tables share a
common data element, the relational database tables can be combined easily to
deliver data required by users.
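
As a rough illustration of how two tables that share a common data element can be combined, the sketch below uses Python's built-in sqlite3 module. The table names (supplier, part), the column names, and the sample rows are all hypothetical and chosen only for this example; the shared element is supplier_number.

```python
import sqlite3

# In-memory database; tables, columns, and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL
    );
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT NOT NULL,
        supplier_number INTEGER REFERENCES supplier(supplier_number)
    );
    INSERT INTO supplier VALUES (1, 'Acme Supply'), (2, 'Northside Molds');
    INSERT INTO part VALUES
        (137, 'Door latch', 1),
        (145, 'Side mirror', 2),
        (150, 'Door seal', 1);
""")

# Combine the two tables on the data element they share (supplier_number).
rows = conn.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part
    JOIN supplier ON part.supplier_number = supplier.supplier_number
    ORDER BY part.part_number
""").fetchall()

for part_name, supplier_name in rows:
    print(f"{part_name} is supplied by {supplier_name}")

conn.close()
```

Neither table stores the other's details; the join assembles the combined view only when a user asks for it.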
Object-oriented databases
are used to handle graphics-based or multimedia applications. An object-oriented DBMS stores the data and the procedures that
act on those data as objects that can be automatically retrieved and
shared. It can also store more complex
types of information than a relational DBMS; however, it is somewhat slow at
processing large numbers of transactions compared to a relational DBMS.
A DBMS has capabilities and
tools for organizing, managing, and accessing the data in a database. The data definition capability
specifies the structure of the content of the database. It is used to create database tables and to
define the characteristics of the fields in each table. This information is documented in a
data dictionary. A data dictionary is
an automated or manual file that stores definitions of data
components and their characteristics. A
third capability is the data manipulation language, a
specialized language in most DBMS that is used to add, change, delete, and
retrieve the data in a database. It
contains commands that allow end users and programmers to extract data from the
database to satisfy information requests and develop applications. The most prominent data manipulation language
used today is Structured Query Language (SQL).
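
The three capabilities can be sketched with Python's built-in sqlite3 module. This is only an illustration: the employee table and its rows are made up, and SQLite's PRAGMA table_info is used here as a stand-in for a data dictionary, which in a full DBMS is usually a richer facility.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table and define the characteristics of its fields.
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL,
        salary      REAL
    )
""")

# Data dictionary (approximation): ask SQLite for each field's name and type.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
dictionary = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

# Data manipulation: add, change, delete, and retrieve data with SQL.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ana", "Sales", 52000.0), (2, "Ben", "Sales", 48000.0), (3, "Cy", "IT", 61000.0)],
)
conn.execute("UPDATE employee SET salary = 50000.0 WHERE employee_id = 2")
conn.execute("DELETE FROM employee WHERE employee_id = 3")
remaining = conn.execute(
    "SELECT name, salary FROM employee ORDER BY employee_id"
).fetchall()

print(dictionary)
print(remaining)
conn.close()
```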
In order to create a
database, the relationships among the data, the type of data that will be
maintained in the database, how the data will be used, and how the organization
will need to change to manage data from a company-wide perspective must be
clearly understood. A database requires
a conceptual, or logical, design and a physical design. The conceptual design is an abstract model of
the database from a business perspective.
It describes how the data elements in the database are to be grouped to
meet business information requirements.
The physical design shows how the database is actually arranged on
direct-access storage devices.
Businesses use databases
to keep track of their day-to-day activities and to provide
information that helps the company run more efficiently and helps managers
and employees make better decisions.
Special capabilities and tools are required for analyzing large
quantities of data and for accessing data from multiple systems. One capability is data warehousing. A data warehouse is a database that stores
current and historical data of potential interest to decision makers throughout
the company. The data are available
for anyone to access as needed, but they cannot be altered. A data mart is a subset of a data warehouse
in which a summarized or highly focused portion of the organization’s data is
placed in a separate database for a specific set of users. It typically focuses on a single subject area
or line of business.
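
One way to picture the relationship between a data warehouse and a data mart is sketched below with Python's sqlite3 module. The warehouse_sales table, the retail_mart table, and the figures in them are invented for illustration; a real warehouse would be loaded from many operational systems rather than from a handful of inserted rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table holding current and historical sales.
    CREATE TABLE warehouse_sales (
        year             INTEGER,
        line_of_business TEXT,
        region           TEXT,
        amount           REAL
    );
    INSERT INTO warehouse_sales VALUES
        (2022, 'Retail',    'East', 100.0),
        (2022, 'Retail',    'West', 150.0),
        (2023, 'Retail',    'East', 120.0),
        (2023, 'Wholesale', 'East', 900.0),
        (2023, 'Retail',    'East',  30.0);

    -- Data mart: a summarized portion focused on a single line of business,
    -- placed in its own table for a specific set of users.
    CREATE TABLE retail_mart AS
        SELECT year, region, SUM(amount) AS total_amount
        FROM warehouse_sales
        WHERE line_of_business = 'Retail'
        GROUP BY year, region;
""")

mart_rows = conn.execute(
    "SELECT year, region, total_amount FROM retail_mart ORDER BY year, region"
).fetchall()
print(mart_rows)
conn.close()
```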
After the data is in data
warehouses and data marts, it is available for further analysis using tools for
business intelligence, such as multidimensional data analysis and data
mining. These tools enable users to
analyze data to see new patterns, relationships, and insights that are useful
to assist in decision making. Online
Analytical Processing (OLAP) is the capability for manipulating and analyzing
large volumes of data from multiple perspectives, i.e., using multiple
dimensions. Data mining finds hidden
patterns and relationships in large databases and deduces rules from them that
are used to guide decision making and forecast the effect of those decisions.
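
Viewing the same data along multiple dimensions can be sketched in a few lines of plain Python. The sales records and the roll_up helper below are hypothetical, and real OLAP tools work on far larger volumes with precomputed structures; the sketch only shows the idea of summarizing one set of facts from several perspectives.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions: product, region, month.
sales = [
    {"product": "Bolts", "region": "East", "month": "Jan", "units": 10},
    {"product": "Bolts", "region": "West", "month": "Jan", "units": 5},
    {"product": "Nuts",  "region": "East", "month": "Jan", "units": 7},
    {"product": "Bolts", "region": "East", "month": "Feb", "units": 4},
    {"product": "Nuts",  "region": "West", "month": "Feb", "units": 8},
]

def roll_up(facts, *dimensions):
    """Total the units sold for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[dimension] for dimension in dimensions)
        totals[key] += fact["units"]
    return dict(totals)

# The same facts viewed from three different perspectives.
by_product = roll_up(sales, "product")
by_product_region = roll_up(sales, "product", "region")
by_region_month = roll_up(sales, "region", "month")

print(by_product)
print(by_product_region)
print(by_region_month)
```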
A third capability for
analyzing large quantities of data is the set of tools used for accessing internal
databases through the Web. Text mining
tools are able to extract key elements from large unstructured data sets,
discover patterns and relationships, and summarize the information. Web mining is the discovery and analysis of
useful patterns and information from the Web.
Once a database is set up,
special policies and procedures for data management need to be put in
place. This ensures that the data for
the business remains accurate, reliable, and readily available to those who
need and use it. All businesses need an
information policy, which specifies
the organization’s rules for sharing, disseminating, acquiring, standardizing,
classifying, and inventorying information.
These policies lay out the specific procedures and accountabilities,
identifying which users and organizational units can share information, where
information can be distributed, and who is responsible for updating and maintaining
the information. In addition,
steps such as data quality audits and data cleansing must be taken to ensure that the data in organizational databases are
accurate and remain reliable.
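
A very small data cleansing pass might look like the sketch below. The record layout (name and email fields), the cleanse function, and the rules it applies are assumptions made for illustration; real cleansing tools apply much richer, organization-specific rules.

```python
def cleanse(records):
    """Standardize records, drop duplicates, and set aside invalid ones.

    Returns (clean, rejected). Records sharing the same normalized email
    are treated as duplicates and only the first one is kept.
    """
    seen_emails = set()
    clean, rejected = [], []
    for record in records:
        # Standardize formatting: collapse extra spaces, fix capitalization.
        name = " ".join(record.get("name", "").split()).title()
        email = record.get("email", "").strip().lower()

        # Audit rule (hypothetical): a record needs a name and a plausible email.
        if not name or "@" not in email:
            rejected.append(record)
            continue

        if email in seen_emails:
            continue  # redundant copy of a record already kept
        seen_emails.add(email)
        clean.append({"name": name, "email": email})
    return clean, rejected


customers = [
    {"name": "  ana  lopez", "email": "ANA@example.com "},
    {"name": "Ana Lopez", "email": "ana@example.com"},
    {"name": "Ben Ng", "email": "not-an-email"},
]
clean, rejected = cleanse(customers)
print(clean)
print(rejected)
```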