• No results found

• Secondly, we will identify the people who are associated with the design, development and operation of the DBMS software and system environment. We will call them the

N/A
N/A
Protected

Academic year: 2022

Share "• Secondly, we will identify the people who are associated with the design, development and operation of the DBMS software and system environment. We will call them the"

Copied!
44
0
0

Loading.... (view fulltext now)

Full text

(1)

UNIT II

IMPLEMENTATION OF DBMS

(2)

Roles in Database Environment (Database People)

• Firstly, we will identify the people whose jobs involve the day-to-day use of a large database system. We will call them the “Actors on the Scene” or “Database Users”.

• Secondly, we will identify the people who are associated with the design, development and operation of the DBMS software and system environment. We will call them the

“Workers behind the Scene”. These persons are typically

not interested in the database itself.

(3)

• We can identify four distinct types of database users:

– Data and Database Administrators – Database Designers

– Application Programmers or Application Developers

– End Users

Actors on the Scene (Database Users)

(4)

Data and Database Administrators

• The database and the DBMS are corporate resources that must be managed like any other resource.

• Data Administrator (DA) and Database Administrator (DBA) are the roles generally associated with the management and control of a DBMS and its data.

• Data Administrator is a person in the enterprise who has the central responsibility for the data. As stated earlier data is one of the most important resources of corporate, it is imperative that there should be some person who understands the data and the needs of the enterprise with respect to the data at a senior management level. The DA is that person.

• Thus it is the Data Administrator’s job to decide what

data should be stored in the database in the first place

and to establish policies for maintaining and dealing

with that data once it has been stored.

(5)

• An example of such a policy might be one that dictates the enterprise which data to be made public and which data to be kept as confidential.

• Note carefully that the Data Administrator is a manager (decision maker) and not a technician (although he or she certainly does need to have some appreciation of the capabilities of database system at a technical level).

• The technical person responsible for implementing the data administrator’s decisions is the Database Administrator (DBA).

The DBA unlike the Data Administrator is an IT professional.

• The job of DBA is to create the actual database and to

implement the technical controls needed to enforce various

policy decisions made by the DA (Data Administrator).

(6)

• The DBA is also responsible for ensuring that the DBMS operates with adequate performance and for providing a variety of technical services.

• The DBA is typically assisted by a team of Application Programmers and other technical persons.

• In small organizations sometimes there is no distinction

between the role of DA and DBA, in others they perform their

duties as stated above.

(7)

Database Designers

• Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data.

• In large design projects we can distinguish between two types of designer: logical database designers and physical database designers.

• Logical Database Designer is concerned with identifying the data (i.e. the entities and attributes), the relationships between the data, and the constraints on the data that is to be stored in the database. In essence it can be said that Logical Database Designer derives the Conceptual View (note that not the Conceptual Schema).

• The Physical Database Designer decides how the

logical database design created by logical database

designer is to be physically realized i.e. he decides

specific storage structures and access methods for the

data to achieve good performance. In essence it can be

said that the physical database designer derives the

Internal View (note that not the Internal Schema).

(8)

The tasks related to database designers are undertaken before the database is actually implemented and populated with data.

It is the responsibility of the database designers to communicate with all prospective database users, in order to understand their requirements and to come up with a design (certainly in consultation with the DA) that meets these requirements.

In many cases, the designers are on the staff of the DBA.

Database Designers typically interact with each potential group of

users and develop a view of the database that meets the data and

processing requirements of this group. These views are then

analyzed and integrated with the views of other user groups. The

final database design must be capable of supporting the

requirements of all user groups.

(9)

Application Programmers/ Developers

• Once the database has been implemented, the application programs that provide the required functionality to the end users or they satisfy the personal needs of the application programmers are developed.

• This is the responsibility of Application Programmers. To access the database they use some (3GLs) programming language such as C, C++, COBOL, Java etc or some higher- level “fourth generation” language (4GL).

• Such programs access the database by issuing the

appropriate request – typically an embedded SQL

statement – to the DBMS.

(10)

End Users

• The end-users are the clients for the database which has been designed and implemented, and is being maintained to serve their information needs.

• End users may be classified according to the way they use the system:

Naive Users:

• They are typically unaware of the DBMS. They access database through specially written application programs (by application programmers) which attempt to make the operations as simple as possible.

• They invoke database operations by entering simple commands or choosing options from a menu. This means that they do not need to know anything about the database or DBMS.

• The best example is the person booking seats on a

Railway Reservation Counter.

(11)

Sophisticated Users:

• At the other end of the spectrum, the sophisticated end-user is familiar with the structure of the database and the facilities offered by the DBMS.

• Sophisticated end-users may use a high-level query language, such as SQL to perform the required operations.

• Some sophisticated end-users may even write application

programs for their own use.

(12)

Workers Behind the Scene

In addition to those who design, use and administer the database, others are associated with the design, development and operation of the DBMS software and system environment.

These persons are typically not interested in the database itself. We call them the “Workers Behind the Scene”, and they include the following categories:

DBMS System Designers and Implementers

are persons who design and implement the DBMS modules and interface as a software package. DBMS is a complex software and it consists of many components or modules.

Tool Developers

include persons who design and implement

tools – the software packages that facilitate database system

design and use, and help improve performance. Tools are

optional packages that are purchased separately.

(13)

Operators and Maintenance Personnel are the

system administrator personnel who are responsible

for the actual running and maintenance of the

hardware and software environment for the

database system. Clearly they are nothing but the

System Administrators.

(14)

Specific Responsibilities of the DBA

• As stated earlier, the Data Administrator (DA) is the person who makes the strategic and policy decisions regarding the data of the enterprise, and the Database Administrator (DBA) is the person who provides the necessary technical support for implementing those decisions.

• Clearly, the DBA is responsible for the overall control of the system at a technical level. Here we will explore some of the responsibilities of the DBA in more detail.

Defining the Conceptual Schema:

- It is the Data Administrator’s job to decide exactly what information is to be held in the database. Certainly he does this in consultation with the Logical Database Designer. In other words the DA finally decides the entities of interest to the enterprise and decides the information to be recorded about these entities.

(15)

- This process is usually referred to as logical or conceptual database design. The outcome of the logical database design phase is the contents of the database at an abstract level (view level).

- The DBA then creates the corresponding conceptual schema using the DDL.

- The object (compiled) form of the schema will be used by the DBMS in responding to access request.

- The source (un-compiled) form will act as a reference document for the users of the system.

Defining the Internal Schema:

- The DBA must also decide (definitely in consultation with the physical database designer) how the data is to be represented in the stored database. This process is usually referred to as physical database design.

(16)

- Having done the physical design, the DBA must then create the corresponding storage structure definition (i.e. the Internal Schema), using the DDL or SDL.

- In addition he or she must also define the associated conceptual/ internal mapping. In practice either the DDL or the SDL (most likely the former) will probably include the means for defining the mapping, but the two functions (creating the schema and defining the mapping) should be clearly separable.

- Like the conceptual schema, the internal schema and corresponding mapping will exist in both source and object form.

Interacting with the Users:

- It is the business of the DBA to interact/communicate with the users to ensure that the data they need is available to them and to write (or help the users to write) the necessary external schemas using the DDL.

(17)

- In addition the corresponding external/ conceptual mappings must also be defined. In practice the external DDL will probably include the means for specify those mappings, but once again the schemas and mappings should be clearly separable.

- Each external schema and corresponding mapping will exist in both source and object form.

- Other aspects of the user liaison function include consulting on application design, providing technical education, assisting with problem determination and resolution and similar professional services.

Defining Security and Integrity Constraints:

- As stated earlier security and integrity constraints are regarded as part of the conceptual schema. The DDL must include facilities for specifying such constraints.

(18)

Defining dumb and reload policies:

- Once the enterprise is committed to a Database System, it becomes critically dependent on successful operation of that system.

- In the event of damage to any portion of the database – caused by human error or a failure in the hardware or operating system – it is essential to be able to repair the data concerned with the minimum of delay and with as little effect as possible on the rest of the system. For example the availability of data that has not been damaged should ideally not be affected.

- The DBA must define and implement an appropriate damage control scheme, typically involving the following:

• Periodic unloading or dumping of the database to backup storage.

• Reloading the database when necessary from most recent dump.

(19)

- The need for quick data recovery is one reason why it might be a good idea to spread the total data collection across several databases, instead of keeping it all at one place. The individual database might very well form the unit for dumb and reload purposes.

- It goes without saying that such VLDB (Very Large Databases) require very careful and sophisticated administration, especially when there is a requirement for continuous availability (which there usually is).

Monitoring Performance and Responding to Changes in Environment:

- The DBA is responsible for organizing the system in such a way as to get the performance that is “best for enterprise”, and for making the appropriate adjustments – i.e. tuning – as requirements change.

(20)

– For example it might be necessary to reorganize the stored database from time to time to ensure that performance levels remain acceptable.

– As already stated any change to the physical storage (internal) level of the system must be accompanied by a corresponding change to the definition of the conceptual/ internal mapping, so that the conceptual schema can remain constant.

(21)

DBMS Languages

A data sublanguage (DSL) consists of two parts – a Data Definition Language (DDL) and a Data Manipulation Language (DML).

The DDL is used to specify the database schema and the DML is used to both read and update the database.

These Languages are called the data sublanguages because

normally (in present day DBMSs) they do not include constructs for all computing needs such as conditional or iterative statements, which are provided by the high level programming languages.

Many DBMSs have the facility for embedding the data sublanguage in a high level programming language (User Exits) such as COBOL, C, C++, Java etc. In this case the high level language is sometimes

referred to as the host language.

(22)

The Data Definition Language (DDL)

• The database schema is specified by a set of definitions expressed by means of a special language, called the data definition language.

• The DDL is used to define a schema or to modify an existing one. It can’t be used to manipulate the data.

• The result of the compilation of the DDL statements is a set of tables stored in a special file collectively called the system catalog.

• The System Catalog integrates the meta data that is data that describe objects in the database and makes it easier for those objects to be accessed or manipulated.

• The meta data contains definitions of records, data

items and other objects that are of interest to the user

or are required by the DBMS.

(23)

The Data Manipulation Language (DML)

• Data manipulation operations usually include the following:

Insertion of the new data in database.

Modification of data stored in database.

Retrieval (Query) of data contained in the database.

Deletion of data from the database.

• Therefore one of the most important functions of DBMS is to support data manipulation language in which the user can construct statements that will cause the above data manipulation to occur.

• Data Manipulation at higher levels namely at

conceptual level and at the external levels is

obvious. But what sort of data manipulation at

internal level??

(24)

Procedural or Low Level DML

With a procedural DML the user or more precisely the programmer specifies what data is needed and how to obtain it.

This means that the user must express all the data access operations that are to be used by calling appropriate procedures to obtain the information required.

This type of DML typically retrieves individual records or objects from the database and processes each separately.

Clearly these DMLs need to use the programming language constructs such as looping to retrieve and process each record from a set of records.

Typically they are embedded with conventional High Level

Programming Languages so that they could be able to use the

native constructs of a High Level Language. Some of them

provide their own constructs to serve the purpose.

(25)

• They are also called record-at-a-time DMLs because of their nature of processing a record at a time.

• Network and Hierarchical DMLs are normally procedural.

Example:

IBM’s IMS (Information Management System)

was the leading DBMS based on Hierarchical

Data Model. Its Data Manipulation Language

was DL/1 and it was procedural DML.

(26)

Non-procedural Data Manipulation Languages

Non-procedural DMLs allow the required data to be specified in a single retrieval or update step.

With these sort of DMLs the user specifies what data is required without specifying how it is to be obtained.

In case of non-procedural DMLs the DBMS translates a DML statement into a procedure (or set of procedures) that manipulates the required set of records. This frees the user from having to know how data structures are internally implemented and what algorithms are required to retrieve and possibly transform the data, thus providing users with a considerable degree of data independence.

They are also known as declarative languages or set-at-a-time or set-oriented DMLs.

Relational Database Management System (RDBMS) usually

include some form of non-procedural DMLs typically SQL or

QBE (Query by Example). They are easy to learn and use as

compared to Procedural DMLs.

(27)

Example:

Consider following SQL statement executed on the given table STUDENT <ENo,RNo,Name,Fname,Hall,Fee>

SELECT * FROM STUDENT WHERE FEE>2000.00

When above statement is executed it displays the information of all students whose fee is greater than Rs.

2000. Thus the result is a set of records and not a single

record. Thus SQL is non-procedural DML.

(28)

Fourth Generation Languages (4GLs)

• There is no consensus about what constitute a fourth generation language; it is in essence a shorthand programming language.

• An operation that requires hundreds of lines in a conventional programming language (3GL) like C, generally requires significantly fewer lines in a 4GL.

• Compared with a 3GL which is procedural, a 4GL is non-procedural. The user only defines what is to be done and not how it is to be done.

• A 4GL is expected to rely largely on much higher-level components known as fourth generation tools.

• The user doesn’t define the steps that a program needs to perform a task, but instead defines parameters for the tools that use them to generate an application program.

• It is claimed that 4GLs can improve the productivity by

a factor of ten.

(29)

• Following languages may be kept in the category of Fourth generation languages :

Presentation Languages, such as query languages and report generators.

Specialty Languages such as spreadsheets and database languages.

Application Generators that define, insert, update and retrieve data from the database to build applications (The data control object of Visual Basic).

Very high level languages that are used to generate application code (Visual Basic itself)

• SQL (Structured Query Language), QBE (Query by Example), Data Control Object of VB, VB itself,

Oracle’s Form, Oracle’s Menu and Oracle’s Report writer, JDBC etc. all are examples of 4GLs.

• Some of them are being discussed in the next few

slides.

(30)

Form Generators

A forms generator is an interactive facility for rapidly creating data inputs and display layouts for screen forms.

The forms generator allows the user to define what the screen is to look like, what information is to be displayed, and where on the screen it is to be displayed.

It may also allow the definition of colors for screen elements and other characteristics, such as bold, underline and blinking etc.

The better forms generator allows the creation of derived attributes using the arithmetic operators and the specification of validation checks for data input.

A very simple form is shown below

(31)
(32)

Report Generators

• A Report Generator is a facility for creating reports from data stored in the database.

• It is similar to a query language in that it allows the user to ask questions of the database and retrieve information

from it for a report. However, in the case of a report generator we have much greater control over what the output looks like.

• We can let the report generator automatically determine how the output should look like or we can create our own customized output reports using special report generator command instructions.

• There are two main types of Report Generators:

Language Oriented Visually Oriented

(33)

• In the first case we enter a command in a sublanguage to define what data is to be included in the report and how the report is to be laid out.

• In the second case we use a facility similar to a form generator

to define the same information.

(34)

Graphics Generators

A graphics generator is a facility to retrieve data from the database and display the data as a graph showing trends and relationships in the data (OLAP: Online Analytical Processing).

Typically it allows the user to create bar charts, pie charts, scatter charts etc. from the data stored in the database.

Application Generators

An application generator is a facility for producing a program that interfaces with the database. The use of an application generator can reduce the time it takes to design an entire software application.

Application generators typically consist of pre-written modules that comprise fundamental functions that most database oriented programs use.

An example of an application generator is VB data control object.

(35)

Components (Structure) of the DBMS

• DBMSs are highly complex and sophisticated pieces of software. It is not possible to generalize the components (structure) of a DBMS as it varies greatly from system to system.

• However, it is useful when trying to understand database systems to try to view the components of a DBMS and the relationships among them.

• Here we shall examine a possible structure for a DBMS.

(36)
(37)

• DBMS is partitioned into modules that deal with each of the responsibilities of the overall system. The functional components of DBMS can be broadly divided into two categories:

- Query Processor Components.

- Storage Manager Components.

• The Query Processor Components include:

• DDL Interpreter:

- The DDL interpreters data definition language statements and records them in a set of tables containing metadata. Clearly DDL statements are interpreted and corresponding table definitions and constraints are stored in the data dictionary. The output of DDL interpreter is in a form that can be used by other components of the DBMS.

- Clearly the output is in object form.

(38)

• DML Compiler:

- It translates DML statements in a query language into low level (object code) instructions that the Query Evaluation Engine understands.

- In addition the DML compiler attempts to translate a user’s request into an equivalent but more efficient form thus decides a good strategy for executing the query. This is known as query optimization.

• Embedded DML precompiler:

- It converts DML statements embedded in an application program to normal procedure calls in the host language. The precompiler must interact with the DML compiler to generate the appropriate code.

- The resultant host language program is run with the host language compiler to generate application program object code, which are utilized by application programmers and Query Evaluation Engine.

(39)

• Query Evaluation Engine:

- It executes low level instructions (object code) generated by the DML compiler or by the HLC (Host Language Compiler).

Storage manager components - provide the interface between the low level data stored in the database and the application programs and queries submitted to the system. The storage manager components include:

• Authorization & Integrity Manager:

- It tests for the satisfaction of integrity constraints and checks the authority of users to access data.

• Transaction Manager:

- It ensures that the database remains in a consistent (correct) state despite system failures, and that concurrent transaction executions proceed without conflicting.

(40)

• File Manager:

- It manages the allocation of space on disk storage and the data structures used to represent information on the disk. The file manager may be a part of OS or it may be written specifically for the DBMS.

• Buffer Manager:

- It is responsible for fetching data from the disk manager to main memory and deciding what data to cache in memory.

In addition to the above, several data structures are required as part of the physical system implementation:

• Indices (Access Aids):

- It improves the performance of the DBMS. A set of access aids in the form of indices is usually provided with a DBMS. Commands are provided to build, set and destroy the indices.

(41)

• Data Dictionary:

- A data dictionary keeps track of the definitions of all the data items in the database. It is a table of data elements including at least the names, data types and lengths of every data item in the subject database.

- The data dictionary can be viewed as being part of the database itself. Thus the database is self describing since it contains information describing its own structure.

- The information in the data dictionary is called ‘metadata’ or

‘data about data’. The metadata are available for query and manipulation just as other data in the database.

(42)

• Statistical Data:

- It stores statistical information about the data in the database.

This information is used by the query processor to select efficient ways to execute a query.

• Disk Manager:

- The disk manager is part of the operating system and all the physical input and output operations are performed by it. The disk manager transfers the block requested by the file manager so that the file manager need not be concerned with the physical characteristics of the underlying storage media.

(43)

We now discuss the steps when a user wants to access a record or a set of records from the database (consider the following figure):

1. A user’s request for the data is received by the query processor which determines the physical record(s) required. This decision requires some preliminary consultation with the data dictionary.

2. The query processor sends the request for a specific physical record(s) to the storage manager. The storage manager decides which physical block(s) of secondary storage devices contain the required record(s).

A block is a unit of physical I/O operations between primary and secondary memory.

3. The disk manager retrieves the physical block(s) required by the storage manager and sends it back to the storage manager, which sends it to the query processor.

(44)

DBMS User Interface

Query Processor

Storage Manager

Disk Manager

Database User's Query

Response to User

Request Specific Records

Requested Record(s)

Request Specific Block (s)

Requested Block(s)

Input/ Output Block(s)

Block(s) from Secondary

Storage

References

Related documents

Percentage of countries with DRR integrated in climate change adaptation frameworks, mechanisms and processes Disaster risk reduction is an integral objective of

The Congo has ratified CITES and other international conventions relevant to shark conservation and management, notably the Convention on the Conservation of Migratory

The occurrence of mature and spent specimens of Thrissina baelama in different size groups indicated that the fish matures at an average length of 117 nun (TL).. This is sup- ported

INDEPENDENT MONITORING BOARD | RECOMMENDED ACTION.. Rationale: Repeatedly, in field surveys, from front-line polio workers, and in meeting after meeting, it has become clear that

3 Collective bargaining is defined in the ILO’s Collective Bargaining Convention, 1981 (No. 154), as “all negotiations which take place between an employer, a group of employers

Section 2 (a) defines, Community Forest Resource means customary common forest land within the traditional or customary boundaries of the village or seasonal use of landscape in

Theories of learning and development of behavior Hierarchy of needs by Maslow(1954).. Social learning theory by Bandura(1963) Classical conditioning by Pavlov(1927) Operant

Based on the assumption that revenue from additional carbon pricing would be transferred back to households as lump-sum payments, we estimate that the level of real GDP in 2030