Showing posts with label hive. Show all posts
Showing posts with label hive. Show all posts

Wednesday, April 7, 2021

Introduction to Parallel Processing with Map Reduce Algorithm in Big Data Technology

 

Introduction to Parallel Processing with Map Reduce

                                                   

* Parallel Processing system is a clever and robust method to process huge amounts of data within a short period of time .


* This is ensured by enlisting the services of several computing devices to work  on different parts of a job simultaneously


* An ideal parallel processing system can work across multiple computational problems using any number of computing devices across any size of data sets with ease and high programming productivity


* Parallel Programming can be achieved in such a way that it can be broken down into many parts such that each of the parts can be partially processed independently of the other parts and then processing the intermediate results from processing the parts which can be combined to produce a final solution .


* Infinite parallel processing is the most important essence of the laws of nature

Tuesday, March 2, 2021

SQL - Structured Query Language ( a basic course overview & revision on the paradigms and features of SQL as a Relational Database Management Language )

 

      SQL - Structured Query Language

                                                    ( a basic overview )

 




·          The backbone of any Query Based Language is Relational Algebra which provides a concise , formal notation for representation of queries . However , commercial database systems required a query language which would be much user friendly and for this purpose SQL short for Structured Query Language was created which was the most influential commercially marketed query language .

 

·         SQL uses a combination of relational algebra and relational calculus constructs for its construction and implementation

 

·         Although we refer to SQL language as a "query language" , SQL can do much more than just query a database . SQL can define the structure of the data , modify the data in the database and it can also specify security constraints . This means that SQL can present to its user a brief highlight of the size of the database that the user is working upon . SQL can also be used to define any database and also the tables that a user wants to create and maintain over a database residing in the system as well as server memory . Also , on top of that SQL can be used to modify the structure of the database over which other data reside . Also , SQL can be used to specify the various security constraints that need to be defined over a database which would make the data specific to the users it is meant for by provisioning the access specifiers for access limitation as well as defining the needed integrity constraints which would ensure the ACID properties of a suitable relational database management system .

 

·          In this article a complete User's Guide and How to Use / Implement a SQL installation won't be discussed nor is it intended for in this basic article but the aim of the article is to present SQL's fundamental constructs and concepts for brief usage by any User who is not only new to SQL but may be an intermediate or highly skilled versatile user of the language .

 

·         Background of SQL - IBM developed the original version of SQL which was originally called as "Sequel" ( Any of the present day readers of the article can perhaps think of the word Sequel as some sort of movie sequel kind of the unlimited sequels of the Star Wars series .. seems a poor joke ) as part of the System R project of the early 1970's . The Sequel language has evolved since then , and its name has changed to SQL ( Structured Query Language ) . Many applications and products in the form of standalone and server based products now implement SQL as a part of their database engine and thus over the years SQL has become the supreme rdbms package which is not only widely used but also the most user friendly product among all the users of the current database users .

 

·          In 1986 , the American National Standards Institute ( ANSI ) and the International Organisation of Standardisation ( ISO ) published an SQL standard which used to be called as SQL-86 at that point of time ( the postfix 86 succeeding SQL obviously comes from the manufacturing year of 1986 ) . Following this the standardisation organisation ANSI published an extended standard for SQL , SQL -89 in the year 1989 . The next version of the standard was then published in the year 1992 and thus the version was therefore called as SQL-92 standard , and then later versions followed like SQL-1999 etc . The most recent version that I have worked upon and also is installed over my system is the SQL-2003 version which serves most of the purpose . Apart from some server connection issues and importing data from other incompatible transactional data formats , I have not faced any major hurdles while using the present version of standalone SQL installation , However , if one wants to read a full scale bibliographic notes reference of these standards , one may read them in the documentation notes of the accompanying installation CD or executable software .

 

·         SQL language has the several parts namely :

 

1) Data Definition Language (DDL)

The SQL DDL provides commands for defining the relation schemas , deletion of the relations and modification of the relation schemas .

 

2) Interactive Data Manipulation Language (DML )

 The SQL DML includes a query language based on both the relational algebra and the tuple relational calculus . Which means that one can enter queries over a table and find out the necessary information from the database . The search query is mostly written in a language which implements the core

principles and working of relational algebra and relational calculus as their underlying coding platforms

 

3) Integrity

The SQL DDL includes commands for specifying integrity constraints that the data stored in the database must satisfy. All the necessary updates that violate the integrity constraints are not allowed over the query engine of SQL.

 

4) View Definition

The SQL DDL includes commands for defining the views upon a database. Views are nothing but a sub-part representation of a main table present over a database and it consists of only the structure of the database with filtered or unfiltered data from the main table .However any form of DML statements cannot be made to work upon a view.

 

5) Transaction Control

SQL includes commands for specifying the beginning and ending of the transactions . The broader meaning of this statement says about the way in which stored procedures , cursors and triggers could be used for the sake of making and breaking any transaction at the desired point of time

 

6) Embedded SQL and Dynamic SQL

Embedded and Dynamic SQL defines the manner in which SQL statements can be embedded within general purpose programming languages such as C , C++ , Java , PL/I , Cobol , Pascal and Fortran .

 

7) Authorization

The SQL DDL includes commands for specifying access rights to relations and views . This means that SQL provides its users with the potential to access the tables present over a given / created / hosted database specific rights for access and views to its data within the database . Many database systems support most of the SQL-92 standards and some of the new constructs which are present in SQL : 1999 and SQL : 2003 although no databases provide non-standard features which defer from the detailed and provided features of SQL as provided in the standard specification manual of the different release versions of SQL .