Data Science and AI Quest: sql

Wednesday, March 17, 2021

Cartesian Product Operation in SQL and RDBMS - concept discussion with example

Cartesian Product Operation

* The Cartesian product operation, denoted by a cross (x), allows us to combine information from any two relations.

* One can write the Cartesian Product of Relations R1 and R2 as R1 x R2

* A relation is, by definition, a subset of a Cartesian product of a set of domains.

* That definition gives the intuition for the operation: r1 x r2 pairs every tuple of r1 with every tuple of r2.

* Since the same attribute name may appear in both R1 and R2, we need a naming scheme to distinguish between the attributes.

* We do this by attaching to each attribute the name of the relation from which it originally came.

* For example, the relation schema for

r = borrower x loan is:

(

borrower.customer_name ,

borrower.loan_number,

loan.loan_number ,

loan.branch_name ,

loan.amount

)

* With this schema, one can distinguish borrower.loan_number from loan.loan_number: the two attributes share a name but come from different relations, borrower and loan.

* This naming convention requires that the relations that are arguments of the Cartesian product have distinct names.

* In general, if we have two relations r1(R1) and r2(R2), then r = r1 x r2 is a relation whose schema is the concatenation of the schemas R1 and R2.

The relation r contains every tuple t for which there is a tuple t1 in r1 and a tuple t2 in r2 such that t[R1] = t1[R1] and t[R2] = t2[R2].
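The definition can be sketched in a few lines of Python. This is only an illustration: the sample rows and the helper name cartesian_product are assumptions made for the example, and relations are modelled as plain lists of dictionaries rather than real database tables.

```python
from itertools import product

# Hypothetical sample rows; attribute names follow the borrower/loan schema above.
borrower = [
    {"customer_name": "Adams", "loan_number": "L-16"},
    {"customer_name": "Hayes", "loan_number": "L-15"},
]
loan = [
    {"loan_number": "L-15", "branch_name": "PerryRidge", "amount": 1500},
    {"loan_number": "L-16", "branch_name": "PerryRidge", "amount": 1300},
    {"loan_number": "L-17", "branch_name": "Downtown", "amount": 1000},
]


def cartesian_product(r1, name1, r2, name2):
    """Pair every tuple of r1 with every tuple of r2.

    Attribute names are prefixed with the relation name so that
    identically named attributes (loan_number here) stay distinct.
    """
    result = []
    for t1, t2 in product(r1, r2):
        row = {f"{name1}.{key}": value for key, value in t1.items()}
        row.update({f"{name2}.{key}": value for key, value in t2.items()})
        result.append(row)
    return result


rows = cartesian_product(borrower, "borrower", loan, "loan")
print(len(rows))        # 2 borrower tuples x 3 loan tuples = 6
print(sorted(rows[0]))  # the five qualified attribute names
```

Every result row carries all five qualified attributes, and the row count is the product of the two input sizes.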

* Suppose we want to find the names of all the customers who have a loan at the "PerryRidge" branch. We need the information in both the loan relation and the borrower relation, which the following selection over their Cartesian product provides.

<< FIG - 01 >>

The expression selects, from the Cartesian product of borrower and loan, the tuples whose branch_name is "PerryRidge".

* However, the Cartesian product associates every tuple of loan with every tuple of borrower, so a customer who has a loan at the "PerryRidge" branch is paired not only with that loan but with every other loan as well. To keep only the tuples of borrower x loan that pertain to the customer's own loan, add the condition

borrower.loan_number = loan.loan_number

and then project on customer_name to obtain the names.
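The same query can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: the table contents are made-up sample data, and the column types are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE borrower (customer_name TEXT, loan_number TEXT);
    CREATE TABLE loan (loan_number TEXT, branch_name TEXT, amount INTEGER);
    INSERT INTO borrower VALUES ('Adams', 'L-16'), ('Hayes', 'L-15'), ('Smith', 'L-17');
    INSERT INTO loan VALUES ('L-15', 'PerryRidge', 1500),
                            ('L-16', 'PerryRidge', 1300),
                            ('L-17', 'Downtown', 1000);
""")

# CROSS JOIN builds borrower x loan; the WHERE clause keeps only the pairs
# that describe the same loan and belong to the PerryRidge branch.
query = """
    SELECT borrower.customer_name
    FROM borrower CROSS JOIN loan
    WHERE borrower.loan_number = loan.loan_number
      AND loan.branch_name = 'PerryRidge'
    ORDER BY borrower.customer_name
"""
perryridge_customers = [row[0] for row in conn.execute(query)]
print(perryridge_customers)  # ['Adams', 'Hayes']
```

Without the loan_number condition, every borrower would be paired with both PerryRidge loans and Smith would wrongly appear in the result.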

Union Operation in RDBMS - Fundamental Relational Algebra Concept

* A scenario where a union operation over relations could be used:

Consider a query to find the names of all the bank customers who have either an account or a loan or both in a bank.

* Answering it requires information from both the "depositor" relation and the "borrower" relation.

* To find the names of all the customers with a loan in the bank, the query would be:

{figure-01}

The expression above is a projection: it returns its argument relation (the one named inside the parentheses) with only the listed attributes kept. Since we are interested in the names of the customers who have a loan, the result is obtained from the projection shown above.

* Similarly , if we want to know the names of all the customers with an account in the bank , then it can be expressed in the following projection equation :

{figure-02}

* Therefore, to answer the question posed in the opening scenario, we need the union of these two sets: all the customer names that appear in either or both of the two relations. The binary operation Union, applied to the two queries, gives exactly that. The relational expression can be written as:

{figure-03}

* The result is a relation containing the customer names from both relations.

* Since relations are sets, duplicate values are eliminated: a customer who has both an account and a loan appears only once.
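The union query can be sketched with Python's sqlite3 module, where SQL's UNION plays the role of the relational operator. The rows below are made-up sample data, and the account_number column is an assumption added only to give the depositor table a second attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE depositor (customer_name TEXT, account_number TEXT);
    CREATE TABLE borrower (customer_name TEXT, loan_number TEXT);
    INSERT INTO depositor VALUES ('Hayes', 'A-102'), ('Johnson', 'A-101');
    INSERT INTO borrower VALUES ('Hayes', 'L-15'), ('Adams', 'L-16');
""")

# Each SELECT is a projection on customer_name; UNION combines the two
# results and removes duplicates, as the set-based definition requires.
query = """
    SELECT customer_name FROM borrower
    UNION
    SELECT customer_name FROM depositor
    ORDER BY customer_name
"""
names = [row[0] for row in conn.execute(query)]
print(names)  # ['Adams', 'Hayes', 'Johnson']
```

Hayes has both an account and a loan but is listed once. SQL's UNION ALL would keep the duplicate, which is a departure from the pure set semantics described here.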

In Hindsight :

In our example, we took the union of two sets, both of which consisted of "customer_name" values. In general, one must ensure that unions are taken between compatible relations, so that the result is meaningful.

* Points to note when using a Union Operation over relational sets:

For the union { r U s } of two relations r and s to be valid, both of the following conditions must hold:

1) The relations r and s must have the same arity, that is, the same number of attributes.

2) The domains of the ith attribute of r and the ith attribute of s must be the same for all values of i. Note that r and s may be either database relations or temporary relations that are the result of relational algebra expressions, such as those in figure 1 and figure 2 of the article.
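The two conditions can be expressed as a small check. This is a sketch under stated assumptions: a schema is modelled as a list of (attribute name, domain) pairs with Python types standing in for domains, and the function names union_compatible and union are invented for the illustration.

```python
def union_compatible(schema_r, schema_s):
    """Return True when both schemas have the same arity and the domains
    of the i-th attributes match for every position i."""
    if len(schema_r) != len(schema_s):
        return False
    return all(
        dom_r == dom_s
        for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )


def union(r, schema_r, s, schema_s):
    """Set union of two relations, refused when they are not compatible."""
    if not union_compatible(schema_r, schema_s):
        raise ValueError("relations are not union-compatible")
    return set(r) | set(s)


# Hypothetical schemas and tuples.
names_schema = [("customer_name", str)]
loan_schema = [("loan_number", str), ("amount", int)]

borrower_names = {("Hayes",), ("Adams",)}
depositor_names = {("Hayes",), ("Johnson",)}

all_names = union(borrower_names, names_schema, depositor_names, names_schema)
print(sorted(all_names))                            # three distinct one-attribute tuples
print(union_compatible(names_schema, loan_schema))  # False: the arities differ
```

Attribute names are deliberately ignored by the check, since the conditions above only concern the number of attributes and their domains.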

Monday, March 15, 2021

Self Variable in the __init__ constructor

* 'self' is the conventional name of the first parameter of a constructor or instance method. It refers to the instance of the class that is currently being used (loosely speaking, it holds the memory location of that instance).

* So, one can use 'self' to refer to all the instance variables and instance methods.

* When an instance of the class is created, the instance name refers to the memory location of the instance, and that reference is passed internally to 'self'. For example, one can create an instance of the Student class in the given manner:

s1 = Student()

* Here, 's1' refers to the instance. The same reference is passed internally, and by default, to the 'self' parameter.

* Since 'self' refers to the instance, it can reach all the members of the instance.

* One can use 'self' in the following ways :

1) 'self' is used as the first parameter of the constructor:

def __init__(self):

In this case, 'self' can be used to refer to the instance variables inside the constructor.

2) 'self' is used as the first parameter of an instance method:

def talk(self):

* Here, talk() is an instance method because it acts on the instance variables of the class, which are typically defined in the __init__(self) method.

* To act on the instance variables, the method needs to know which instance they belong to. That reference is available to the talk() method by default through the 'self' parameter.
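The points above can be put together in a short runnable sketch. The name attribute, its default value and the wording returned by talk() are assumptions made for the illustration; only the Student class, the s1 instance and the talk() method come from the text.

```python
class Student:
    def __init__(self, name="Asha"):
        # 'self' refers to the instance being constructed,
        # so this creates an instance variable on that object.
        self.name = name

    def talk(self):
        # 'self' gives the method access to the instance variables.
        return f"Hi, I am {self.name}"

    def who_am_i(self):
        # Return the object that 'self' refers to.
        return self


s1 = Student()
print(s1.talk())             # Hi, I am Asha
print(Student.talk(s1))      # same call, with the instance passed explicitly
print(s1.who_am_i() is s1)   # True: 'self' and 's1' refer to the same object
```

Calling s1.talk() is shorthand for Student.talk(s1), which is why 'self' never has to be passed by hand.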

Friday, March 12, 2021

Introductory Concept of Abstraction in Python

* Languages like Java have keywords such as private, public and protected to control how much of a class is visible from outside. These keywords are called access specifiers.

* Python has no such access specifiers.

* Everything written inside a class is public: all the methods and variables of any Python class are available to code outside the class.

* Suppose one does not want a variable to be directly accessible outside the class. Then one can write the variable with two underscores before its name, as __var. This behaves like a private variable in Python.

* In the following example, 'y' is a private variable since it is written as __y:

class MyClass:
    def __init__(self):
        self.__y = 3

* Now, the variable cannot be accessed from outside the class by its plain name:

m = MyClass()

print(m.__y)

* The preceding print() statement raises an error: AttributeError: 'MyClass' object has no attribute '__y'.

* Even though one cannot access the private variable in this way, it is possible to access it in the format:

instancename._Classname__var

* Python stores the variable under a different name that includes the class name. This is known as name mangling: one underscore goes before the class name and two underscores go after it, followed by the variable name.

* For example, to display the value of the private variable 'y', one can write the following code:

print(m._MyClass__y)

Inside a method of the class, the variable is simply written as:

print(self.__y)

* When a single underscore is used before a name, as in _var, that name is not imported into other files by a "from module import *" statement (unless the module lists it in __all__). It can still be imported explicitly.
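A minimal runnable sketch of name mangling follows. The show() method is an assumption added for the illustration; the class, the attribute and the value 3 follow the example in the text.

```python
class MyClass:
    def __init__(self):
        self.__y = 3  # stored on the instance as _MyClass__y

    def show(self):
        # Inside the class, the plain double-underscore name works.
        return self.__y


m = MyClass()

try:
    m.__y  # outside the class the name is not mangled, so the lookup fails
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)             # the AttributeError message
print(m._MyClass__y)     # 3, reached through the mangled name
print(m.show())          # 3, reached from inside the class
print(sorted(vars(m)))   # ['_MyClass__y']
```

The last line shows that no attribute called __y exists on the instance at all: the only stored name is the mangled one.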

Wednesday, March 3, 2021

Cardinality Ratio concept in "Database Management Systems" with explanatory figurative example

Cardinality Ratio concept in DBMS

Cardinality Ratio

It is the maximum number of relationship instances that an entity can participate in. For example, the relationship between the Student entity and the Guide entity can be described with an entity-relationship diagram.

The relationship can be pictured both as an Entity-Relationship diagram and as a set diagram.

Here, scenarios like the following emerge:

Case-1

One guide (G1) can guide two students (S1) and (S2), whereas (S1) can only be guided by guide (G1).

Case-2

The third student (S3) can be guided by guide (G2).

This is a restriction imposed by the relationship between the two entity sets: it limits how many instances of one set a given instance of the other set can be related to. The relationship diagram shows which particular instances are mapped to which, and therefore how many relationship instances each entity participates in.

This leads to the definition of the Cardinality Ratio:

Definition of Cardinality Ratio

The maximum number of relationship instances that an entity can participate in is called the Cardinality Ratio.

From the above diagram it can be noted that one department has only one HOD. So, in this case, the relationship is 1 : 1.

From the diagram we deduce that :

Relationship R1 exists from Department D1 to HOD H1 .

Relationship R2 exists from Department D2 to HOD H2 .

Relationship R3 exists from the Department D3 to HOD H3 .

A relationship in which each instance of one set is mapped to exactly one instance of the other set is called a One to One relationship.

One can get a better understanding of this with the help of the E-R diagram shown at the bottom of the above figure.

===================================================================

The second type of relationship is the One to Many Relationship

In the given figure, one can notice a relationship between one department and many students. Each department has multiple students, so multiple relationship instances lead from one department to students in the other set, but each student is associated with only one department.

Read from Department to Student, this is a One to Many relationship, written in ratio form as 1:M. Read in the opposite direction, from Student to Department, the same relationship is Many to One (M:1).

This means that many instances of the Student entity type can participate in the "Has" relationship with a single Department instance.

Many to Many Relationship

In the given example, one can see a Many to Many Relationship between the Students set and the Subjects set.

In the relationship diagram, Student is on the left hand side and Subject on the right hand side. One can notice that a student can be related to multiple subjects and a subject to multiple students, so instances of the "Student" set bear a many to many (M:N) relationship with instances of the "Subject" set.

The relation is depicted with the help of an E-R diagram.

===================================================================

Together, these three cases show how relationships can exist between the instances of one entity set and those of another.
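As a rough illustration, the ratio can be read off a list of relationship instances. Strictly speaking, a cardinality ratio is a constraint declared on the schema, so the sketch below only reports what a given sample of pairs exhibits. The function name and the sample pairs (including the subject names) are assumptions made for the example.

```python
def cardinality_ratio(pairs):
    """Classify a binary relationship given as (left, right) instance pairs."""
    left_to_right = {}
    right_to_left = {}
    for left, right in pairs:
        left_to_right.setdefault(left, set()).add(right)
        right_to_left.setdefault(right, set()).add(left)

    # Does some left instance relate to several right instances, and vice versa?
    left_has_many = any(len(v) > 1 for v in left_to_right.values())
    right_has_many = any(len(v) > 1 for v in right_to_left.values())

    if left_has_many and right_has_many:
        return "M:N"
    if left_has_many:
        return "1:N"
    if right_has_many:
        return "N:1"
    return "1:1"


department_hod = [("D1", "H1"), ("D2", "H2"), ("D3", "H3")]
guide_student = [("G1", "S1"), ("G1", "S2"), ("G2", "S3")]
student_subject = [("S1", "Maths"), ("S1", "Physics"), ("S2", "Maths")]

print(cardinality_ratio(department_hod))   # 1:1
print(cardinality_ratio(guide_student))    # 1:N
print(cardinality_ratio(student_subject))  # M:N
```

Swapping the order of each pair turns a 1:N result into N:1, which mirrors how the same relationship reads differently from each side.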

Tuesday, March 2, 2021

Basic Domain Types Supported by SQL

SQL - Structured Query Language ( a basic overview and revision of the features of SQL as a relational database language )

· The backbone of relational query languages is relational algebra, which provides a concise, formal notation for representing queries. However, commercial database systems require a more user-friendly query language. SQL, short for Structured Query Language, is the most influential commercially marketed query language of this kind.

· SQL uses a combination of relational algebra and relational calculus constructs.

· Although we refer to SQL as a "query language", it can do much more than just query a database. SQL can define the structure of the data, modify the data in the database, and specify security constraints. Concretely, SQL can be used to define a database and the tables a user wants to create and maintain in it, and to modify the structure of those tables later, even when they already hold data. It can also restrict access, so that data is available only to the users it is meant for, and define integrity constraints that help keep the stored data consistent.

· This article is not a complete user's guide to installing or using an SQL system. Its aim is to present SQL's fundamental constructs and concepts, both for users who are new to SQL and as a revision for intermediate or experienced users.

· Background of SQL - IBM developed the original version of SQL, originally called "Sequel", as part of the System R project in the early 1970s. The Sequel language has evolved since then, and its name has changed to SQL (Structured Query Language). Many standalone and server-based products now support SQL, and over the years it has established itself as the standard relational database language.

· In 1986, the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO) published an SQL standard, called SQL-86 after its year of publication. ANSI published an extended standard, SQL-89, in 1989. The next version of the standard, SQL-92, was published in 1992, and later versions followed, such as SQL:1999. The most recent version of the standard that I have worked with is SQL:2003, which serves most purposes. Apart from some server connection issues and problems importing data from incompatible formats, I have not faced any major hurdles with my standalone installation. For full bibliographic notes on these standards, one may consult the documentation that accompanies the installation media or software.

· The SQL language has several parts, namely:

1) Data Definition Language (DDL)

The SQL DDL provides commands for defining relation schemas, deleting relations and modifying relation schemas.

2) Interactive Data Manipulation Language (DML)

The SQL DML includes a query language based on both the relational algebra and the tuple relational calculus, so one can enter queries over tables and retrieve the necessary information from the database. It also includes commands to insert tuples into, delete tuples from, and modify tuples in the database.

3) Integrity

The SQL DDL includes commands for specifying integrity constraints that the data stored in the database must satisfy. Updates that violate the integrity constraints are disallowed.

4) View Definition

The SQL DDL includes commands for defining views. A view is a named query over one or more tables: it stores no data of its own but presents a filtered or unfiltered part of the underlying data. Views can be queried like tables, but updates through a view are restricted, and many views are not updatable at all.

5) Transaction Control

SQL includes commands for specifying the beginning and ending of transactions, so that a group of statements is either committed as a whole or rolled back as a whole.

6) Embedded SQL and Dynamic SQL

Embedded and dynamic SQL define how SQL statements can be embedded within general purpose programming languages such as C, C++, Java, PL/I, Cobol, Pascal and Fortran.

7) Authorization

The SQL DDL includes commands for specifying access rights to relations and views. This means that users can be granted specific rights to read or modify particular tables and views in a database.

Many database systems support most of the SQL-92 standard and some of the new constructs in SQL:1999 and SQL:2003, although support is rarely complete. Many systems also provide non-standard features that differ from the standard specification, so the documentation of the specific system should be consulted.
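Several of these parts can be seen working together in a short sketch that uses Python's built-in sqlite3 module. It is only an illustration: the table, the view name perryridge_loan, the CHECK rule and the sample rows are assumptions made for the example. SQLite has no GRANT or REVOKE statements, so the authorization part is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL with integrity constraints: a primary key, NOT NULL and a CHECK rule.
conn.execute("""
    CREATE TABLE loan (
        loan_number TEXT PRIMARY KEY,
        branch_name TEXT NOT NULL,
        amount INTEGER CHECK (amount >= 0)
    )
""")

# View definition: a named query over the loan table.
conn.execute("""
    CREATE VIEW perryridge_loan AS
    SELECT loan_number, amount FROM loan WHERE branch_name = 'PerryRidge'
""")

# DML: insert a tuple and make the change permanent.
conn.execute("INSERT INTO loan VALUES ('L-15', 'PerryRidge', 1500)")
conn.commit()

# Integrity: an update that violates the CHECK constraint is rejected.
try:
    conn.execute("INSERT INTO loan VALUES ('L-99', 'Downtown', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Transaction control: this change is rolled back instead of committed.
conn.execute("UPDATE loan SET amount = amount + 100 WHERE loan_number = 'L-15'")
conn.rollback()

view_rows = conn.execute("SELECT loan_number, amount FROM perryridge_loan").fetchall()
print(rejected)   # True
print(view_rows)  # [('L-15', 1500)]
```

The view returns the committed amount of 1500 because the later update was rolled back, and the negative-amount row never entered the table.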

Thursday, July 22, 2021

Concept of "NoSQL Databases" in Big Data Analytics with 7 sample conceptual questions and answers | Infographic Note with explanation