Friday, April 30, 2021

Validation of Machine Learning Algorithms and Scenarios - A short article

 

                        Validation of Machine Learning Codes

*  It is widely accepted that merely having some examples in the form of a dataset and a machine learning algorithm at hand does not assure that solving a machine learning problem is possible, or that the results will provide the desired solution

 

*  For example, if one wants a computer to distinguish a photo of a dog from a photo of a cat, one can do so with good examples of dogs and cats. One can then train a dog-versus-cat classifier, based on some machine learning algorithm, that outputs the probability that a given photo is that of a dog or a cat. For a set of photos resembling a given photo, the output takes the form of a validation quantity expressing how well the classifier performed, in two senses: speed, meaning how quickly the algorithm processes a batch of photos while finding resemblances across classes using techniques such as segmentation, clustering, or KNN; and accuracy, meaning the degree, expressed as a percentage, to which the referenced sample resembles the sample against which the match is calculated.

 

*  Based on this probability, which is expressed as a percentage accuracy, one can then decide the class (that is, whether the photo shows a dog or a cat) from the estimated probability calculated by the algorithm.

 

*  Whenever the obtained probability or percentage is higher for a dog, one can minimise the risk of making a wrong assessment by choosing the class with the higher chance, which here favours the probability of finding a dog.

 

*  The greater the difference between the likelihood of a dog and that of a cat, the higher the confidence one can have in the resulting choice

 

*  And in case the difference between the likelihood of a dog and that of a cat is small, it can be assumed that the picture of the subject is not clear, or that the subjects in the pictures share many features, which would mean that some of the pictures of cats are similar to those of dogs. Such resemblance can cause confusion and lead to the question of whether the dogs in the concerned pictures look cattish.

 

*   On the point of training a classifier :

When you pose a problem and offer the examples, with each example carefully marked with the label or class that the algorithm should learn, the computer trains the algorithm for a while and finally produces a resulting model out of the training process over the dataset.
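To make this train-then-predict loop concrete, here is a deliberately tiny sketch in plain Python: a nearest-centroid "dog versus cat" classifier. The two numeric features per photo, the training examples, and the function names (`train`, `predict_proba`) are all hypothetical stand-ins for a real feature-extraction and training pipeline:

```python
def train(examples):
    """examples: list of (features, label) pairs; returns per-class centroids."""
    sums, counts = {}, {}
    for feats, label in examples:
        sums.setdefault(label, [0.0] * len(feats))
        counts[label] = counts.get(label, 0) + 1
        sums[label] = [s + f for s, f in zip(sums[label], feats)]
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict_proba(centroids, feats):
    """Turn inverse distances to each class centroid into a crude probability."""
    inv = {label: 1.0 / (1e-9 + sum((f - c) ** 2 for f, c in zip(feats, centroid)) ** 0.5)
           for label, centroid in centroids.items()}
    total = sum(inv.values())
    return {label: w / total for label, w in inv.items()}

# Made-up two-feature representations of labelled photos
training_set = [([0.9, 0.1], "dog"), ([0.8, 0.2], "dog"),
                ([0.2, 0.9], "cat"), ([0.1, 0.8], "cat")]
model = train(training_set)
probs = predict_proba(model, [0.85, 0.15])
print(max(probs, key=probs.get))  # the class with the higher probability
```

A real classifier would of course learn from far richer features, but the shape of the process, labelled examples in, a probability per class out, is the same.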

 

*  The resulting model can then be applied to new examples, for each of which it provides an answer in the form of a probability.

 

*  Labelling is another associated activity that can be carried out, but in the end a probability is simply an opportunity to propose a solution and obtain an answer

 

*  At such a point, one may have addressed all the issues and might guess that the work is finished, but one should still validate the results to ensure that they are comprehensible to a human: the user should have a clear understanding of the involved background processes and a break-up analysis of the code and the result, which enables other readers to understand the code along with the numbers
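A minimal sketch of such a validation check, assuming the model's predictions and the true labels have already been collected as plain lists:

```python
# Held-out predictions versus the actual labels (values are illustrative)
predictions = ["dog", "cat", "dog", "dog"]
actual      = ["dog", "cat", "cat", "dog"]

# Accuracy: the fraction of examples the classifier got right
accuracy = sum(p == a for p, a in zip(predictions, actual)) / len(actual)
print(f"accuracy = {accuracy:.2f}")  # 3 of 4 correct -> 0.75
```

Even a number this simple already gives a reader something concrete to reason about when judging whether the classifier is good enough.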

 

*  This will be elaborated in the forthcoming sessions / articles, where we will look into the various ways in which machine learning results can be validated and made comprehensible to the users

 

Last modified: 16:39

Wednesday, April 28, 2021

Exploring Cost Functions in ML

 

*  The driving force behind the concept of optimisation in machine learning is the response from a function internal to the algorithm, called the cost function

*  One may see other terms used in some contexts, such as loss function, objective function, scoring function, or error function, but the cost function is an evaluation function that measures how well the machine learning algorithm maps the target function that it is striving to guess

 

*  In addition, a cost function determines how well a machine learning algorithm performs in a supervised prediction or an unsupervised optimisation problem

  

*  The evaluation function works by comparing the algorithm's predictions against the actual outcomes recorded from the real world.

 

*  By comparing a prediction against a real value, a cost function determines the algorithm's error level

 

*  Since it is a mathematical formulation, a cost function expresses the error level in a numerical form. The learning process then adjusts the parameters of the function being fitted so that this numeric error in the produced output is kept as low as possible.
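As one common illustration, mean squared error is a standard cost function for numeric predictions; the sketch below computes it in plain Python over made-up values:

```python
def mse(predictions, actuals):
    """Mean squared error: the average of the squared prediction errors."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

# Illustrative predictions versus real observed values
print(mse([2.5, 0.0, 2.0], [3.0, -0.5, 2.0]))  # small errors -> small cost
print(mse([2.5, 0.0, 2.0], [2.5, 0.0, 2.0]))   # perfect predictions -> cost 0
```

The single number it returns is exactly the "response" that an optimisation procedure tries to drive down.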

  

*  The cost function expresses what is actually important and meaningful for the purposes of the learning algorithm

  

*  As a result, when considering a scenario like stock market forecasting, the cost function expresses the importance of avoiding incorrect predictions; in such a case, one wants to make some money while avoiding big losses. In forecasting sales, the concern is different, because one needs to reduce the error in common and frequent situations rather than in rare and exceptional cases, so one uses a different cost function.



  

*  When the problem is to predict who is likely to become ill from a certain disease, algorithms exist that can assign a higher probability to people who share the characteristics of those who actually did become ill later. Depending on the severity of the illness, one may also prefer that the algorithm wrongly flags some people who do not get ill rather than miss the people who actually do get ill
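One hypothetical way to encode such a preference is an asymmetric cost that penalises a missed illness (a false negative) more heavily than a false alarm (a false positive). The weights below are purely illustrative, not taken from any real clinical guideline:

```python
def asymmetric_cost(predicted, actual, fn_weight=5.0, fp_weight=1.0):
    """Total cost over a batch, weighting misses more than false alarms.

    predicted/actual: sequences of 0 (stays well) or 1 (becomes ill).
    """
    cost = 0.0
    for p, a in zip(predicted, actual):
        if a == 1 and p == 0:      # missed a person who became ill
            cost += fn_weight
        elif a == 0 and p == 1:    # false alarm on a healthy person
            cost += fp_weight
    return cost

# One miss (weight 5) plus one false alarm (weight 1)
print(asymmetric_cost([0, 1, 1, 0], [1, 1, 0, 0]))
```

An algorithm trained against such a cost would naturally lean toward over-flagging rather than missing the genuinely ill, which is exactly the trade-off described above.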

  

*  Having gone through these aspects of the usability of cost functions and how they are coupled with ML algorithms in order to fine-tune the result, we will now look at how and why the optimisation of a cost function is done

 

*   Optimisation of Cost Functions :

It is widely accepted as a conjecture that the cost function associated with a machine learning algorithm is what truly drives the success of a machine learning application. Two parts of the process matter here: representation, which is the capability of an algorithm to approximate certain mathematical functions, and optimisation, which is how the machine learning algorithm sets its internal parameters.

 

*  Most machine learning algorithms have their own optimisation associated with their own cost functions, which means that some of the better-developed and advanced algorithms of the time are capable of fine-tuning themselves and arriving at a well-optimised result at each step of the formulation. This sometimes leaves little for the user to do, as the user's role in fine-tuning the learning process and presiding over the aspects of learning becomes less relevant.

 

*  Alongside these, there are algorithms that allow you to choose among a certain number of possible cost functions, which provides more flexibility in setting their course of learning

 

*  When an algorithm uses a cost function directly in the optimisation process, the cost function is used internally. As algorithms are set to work with certain cost functions, the objective of the optimisation problem may differ from the desired objective.

 

*  Depending on how it is used, the associated cost function goes by different names: an error function or loss function is one whose value needs to be minimised, while the reverse is called a scoring function, where the objective is to maximise the result.

  

*  With respect to one's target, a standard practice is to first define the cost function that works best in solving the problem and then to figure out which algorithms work best in optimising it, in order to define the hypothesis space one would like to test. When working with algorithms that do not allow the desired cost function, one can still indirectly influence their optimisation process by fixing their hyper-parameters and selecting the input features with respect to that cost function. Finally, once all the algorithm results are gathered, one can evaluate them using the chosen cost function and then decide on the final hypothesis with the best result from the chosen error function.

 

*  Whenever an algorithm learns from a dataset (a combination of multiple data arranged in attribute order), the cost function associated with that algorithm guides the optimisation process by pointing out the changes in the internal parameters that are the most beneficial for making better predictions. The optimisation continues, iteration by iteration, as the cost function response improves through the algorithm's iterative learning. When the response stalls or worsens, it is time to stop tweaking the algorithm's parameters, because the algorithm is unlikely to achieve better prediction results from there on. And when the algorithm works on new data and makes predictions, the cost function helps to evaluate whether it is working correctly
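The stop-when-the-response-stalls idea can be sketched with a toy one-dimensional example: gradient descent on the cost f(w) = (w - 3)^2, halting once the cost stops improving. The learning rate and stopping threshold are illustrative choices, not recommendations:

```python
def cost(w):
    """A toy cost function with its minimum at w = 3."""
    return (w - 3.0) ** 2

w, lr = 0.0, 0.1
prev = cost(w)
for step in range(1000):
    grad = 2.0 * (w - 3.0)       # derivative of the cost at the current w
    w -= lr * grad               # nudge the internal parameter downhill
    current = cost(w)
    if prev - current < 1e-12:   # response has stalled: stop tweaking
        break
    prev = current

print(round(w, 3))  # converges near the minimum at w = 3
```

The same loop shape, evaluate the cost, adjust parameters, stop when improvement dries up, underlies far more elaborate optimisers.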

 

In conclusion, even if the decision to adopt a particular cost function is an underrated activity in machine learning, it is still a fundamental task, because it determines how the algorithm behaves after learning and how it handles the problem one would like to take up and solve. It is suggested that one should not go with the default cost function, but should instead ask oneself what the fundamental objective of using such a cost function is, so that it yields the appropriate result

 

 

 

Last modified: 00:26

The Learning Process of M.L Algorithms

 *  During the process of optimisation, the machine learning algorithm searches the possible variants of parameter combinations in order to find the one that best allows the correct mapping between the features and the classes during training

 *  This process evaluates many potential candidate target functions from among those that a learning algorithm can guess

 *  The set of all the potential functions that the learning algorithm can figure out is called a Hypothesis Space

 *  One can call the resulting classifier, with its set of parameters, a Hypothesis, which is the machine learning way of saying that the algorithm has set its parameters to replicate the target function and is thus ready to work out correct classifications

  *  The hypothesis space must contain all the parameter variants of all the machine learning algorithms that one may want to try when mapping an unknown function to solve a classification problem. In other words, the hypothesis space contains all the possible variations, in the form of scenarios, through which the machine learning algorithm could manifest itself under the conditions evaluated up to a particular point, and from which the algorithm performs a self-analysis to find the best possible approach for a given condition or problem. Elaborating further on the hypothesis space, one can deduce that it generally contains the target function or an approximation of it, which may differ from the true function.
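As a toy illustration of a hypothesis space, consider one-parameter threshold classifiers of the form h(x) = 1 if x > t; the "optimisation" is then simply a brute-force search over candidate thresholds. The data points below are invented for the example:

```python
# Tiny labelled dataset: one feature per example, class 0 or 1
xs     = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7]
labels = [0,   0,   0,    1,   1,   1]

def accuracy(t):
    """Fraction of examples the threshold classifier h(x) = (x > t) gets right."""
    return sum((x > t) == bool(y) for x, y in zip(xs, labels)) / len(xs)

# The hypothesis space: every candidate threshold we are willing to try
candidates = [i / 20 for i in range(21)]
best_t = max(candidates, key=accuracy)
print(best_t, accuracy(best_t))
```

Each candidate threshold is one hypothesis; searching the list and keeping the best performer is the crudest possible version of what a real optimisation engine does over vastly larger parameter spaces.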

 *  An equivalent could be a child who, in an effort to figure out an image of a tree, experiments with many different creative ideas by assembling one's own knowledge and experiences. Parents certainly play a major role in this learning phase, providing all kinds of relevant environmental inputs for the faster and more effective upbringing of the child. In machine learning, say in supervised learning, one has to provide the right learning algorithm, supply some non-learnable parameters called hyper-parameters, choose a set of examples to learn and adapt from, and then select the features that accompany the examples. And just as a child cannot always learn to distinguish right from wrong if left alone in the world (consider the case depicted in the book Lord of the Flies), a machine learning algorithm likewise needs multiple directions and interjections in order to facilitate the smooth running and execution of a program.

 

*  So even after the completion of the learning process, a machine learning classifier often cannot unequivocally map the examples to the target classification, because many false and erroneous mappings are possible. These can mar the generation of the best possible results and render the learning process ineffective, as the learning algorithm may pick up wrong paths and end up with insufficient data points to discover the right function. In addition, noise (itself a great factor in machine learning) also affects the process of learning

 

*  In the real world as well, noise acts as the same kind of impediment to the process of learning and degrades the effective learning mechanism. Similarly, many extraneous factors and errors occur during the recording of the data, distorting the values and features to be read and understood. In a true sense, therefore, a good machine learning algorithm should distinguish the signals that can map back to a target function even when extraneous environmental noise is still in play.

 

Last modified: 27 Apr 2021

Monday, April 26, 2021

Learning Process of Machine Learning Algorithms - a precursor article


* Even though supervised learning is the most popular and frequently used of all the learning processes, all machine learning algorithms respond to the same logic: read small or large sets of data at a time, find meaningful patterns in the cited parametric dataset, work out the best contributing features from the data, and then find any applicable models from the data


* The central idea of any learning process is that one can represent reality using a mathematical function that the algorithm does not know in advance but will comprehend from the data, and from which it can then guess some important findings and predictions. This concept is the core idea for all kinds of machine learning algorithms


* As witnessed from several readings, experts on the subject of machine learning rely on supervised machine learning and classification as the most pivotal of all the learning types, and provide explanations of its inner functioning that one can extend to other types of machine learning approaches as well


* The objective of a supervised learning classifier is to assign a class to an example after having examined some characteristics of the example. Such characteristics are called "features", and they can be either quantitative (numeric values) or qualitative (string labels).


* In order to assign classes correctly, a classifier must first examine a certain number of known examples closely (examples that already have a class assigned to them), each of which is accompanied by the same kinds of features as the examples that do not have any classes


* The training phase involves the observation of many examples by the classifier, which helps the algorithm learn so that it can provide an answer in terms of a class whenever it sees an example without a class


 * We can relate to what happens in a training process by imagining a child learning to distinguish trees from other objects. This is not a one-off event: when a child sees a tree, it also learns the associated attributes that resemble a tree, and the process keeps continuing, again and again, whenever perception occurs through the visual faculties of the eye and the processing of the brain, infused with conscious recognition of the environment. So whenever an image of a tree comes to mind, the perception is kindled again and the child adapts itself to the picture of a tree.


* So, whenever a similar tree bearing leaves, a green texture and brown bark comes to the mind of the child, the child becomes mentally attuned to the perception, which also helps in the recognition of other similar objects around oneself. All of this helps a child create an idea of what a tree looks like by contrasting the display of tree features with images of other objects, such as pieces of furniture that are made of wood but do not share the other characteristics of a tree.


* A machine learning classifier works through the same process. The algorithm builds its cognitive capabilities by creating a mathematical formulation that includes all the given features in such a way that it creates a function which can distinguish one class from another.

* Being able to express such a mathematical formulation is the representation capability of a classifier. From a mathematical perspective, one can express the representation process in machine learning using the concept called "mapping". Mapping takes place when one discovers the construction of a function by observing its outputs; the process is a retrospective one, working back from the observed outputs toward the inputs that determined them. One can say that a successful mapping process in machine learning is similar to a child internalising the idea of an object: the child develops the required skills of learning from the environment and then uses the acquired knowledge to distinguish a given set of objects when called for. The child, after internalising things, understands the abstract rules derived from the facts of the world so effectively that, upon seeing a tree, the child immediately recognises it.


* Such a representation (using abstract rules derived from real-world facts) is possible because the learning algorithm has many internal parameters, which consist of vectors and matrices of values. The dimension and type of the internal parameters delimit the kind of target functions that an algorithm can learn. An optimisation engine in the algorithm changes the parameters from their initial values during the process of learning in order to represent the target's hidden function. This paragraph is admittedly a bit complex, and some explanatory diagrams would help in properly understanding the mentioned jargon.
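A hedged sketch of these internal parameters in action: a weight and a bias that an optimisation loop nudges toward a hidden target function y = 2x + 1, using per-example gradient steps on squared error. The learning rate and epoch count are illustrative choices:

```python
# Samples of the hidden target function y = 2x + 1 (noiseless, for clarity)
data = [(x, 2 * x + 1) for x in range(10)]

w, b, lr = 0.0, 0.0, 0.01   # internal parameters start at arbitrary values

for _ in range(2000):
    for x, y in data:
        pred = w * x + b
        err = pred - y
        w -= lr * err * x    # adjust the internal parameters ...
        b -= lr * err        # ... in the direction that reduces the error

print(round(w, 2), round(b, 2))  # close to the hidden parameters 2 and 1
```

The pair (w, b) is the entire "matrix of values" here; real models have millions of such parameters, but the principle of nudging them toward a hidden function is the same.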


The construction of any applicable machine learning algorithm from a mathematical construct, employing statistical formulations of hypothesis conjectures, will be covered under a separate title in the series on the learning process of machine learning algorithms



The Various Categories of Machine Learning Algorithms with their Interpretational learnings


Machine learning comes in three different flavours, depending on the algorithms and the objectives they serve. One can divide machine learning algorithms into three main groups based on purpose:

01)      Supervised Learning

02)      Unsupervised Learning

03)      Reinforcement Learning

Now in this article we will learn more on each of the learning techniques in greater detail .

==================================

01)      Supervised Learning

==================================

*  Supervised learning occurs when an algorithm learns from example data and associated target responses, which consist of numeric values or string labels such as classes or tags, and can later predict the correct response when presented with new examples

*  The supervised learning approach is similar to human learning under the guidance and mentorship of a teacher . This guided teaching and learning of a student under the aegis of a teacher is the basis for Supervised Learning

*  In this process , a teacher provides good examples for the student to memorize and understand and then the student derives general rules from the specific examples

*  One can distinguish between regression problems, whose target is a numeric value, and classification problems, whose target is a qualitative variable such as a class or a tag, as in the case of a selection criterion

*  More on Supervised Learning Algorithms with examples would be discussed in later articles .

==================================

02)      Unsupervised Learning

==================================

*  Unsupervised learning occurs when an algorithm learns from plain examples without any associated response in the target variable, leaving the algorithm to determine the data patterns on its own

 *  This type of algorithm tends to restructure the data into something else , such as new features that may represent a class or a new series of uncorrelated values



*  Unsupervised Learning is quite useful in providing humans with insights into the meaning of the data and new useful inputs to supervised machine learning algorithms

*  Unsupervised learning resembles the methods that humans use to figure out whether certain objects or events belong to the same class, by observing the degree of similarity between the given objects

*  Some of the recommendation systems that one may have come across on several retail websites or applications are a form of marketing automation based on this type of learning

*  The marketing automation algorithm derives its suggestions from what one has done in the past

*  The recommendations are based on an estimation of which group of customers one resembles the most, and then inferring one's likely preferences from that group
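One simple (and purely illustrative) way such a recommendation could work is to find the most similar customer by purchase history and suggest the items they bought that the current user has not. The customer names and purchases below are made up:

```python
# Made-up purchase histories, one set of items per customer
purchases = {
    "alice": {"book", "pen", "lamp"},
    "bob":   {"book", "pen", "desk"},
    "carol": {"shoes", "hat"},
}

def similarity(a, b):
    """Jaccard similarity between two purchase sets (shared / total items)."""
    return len(a & b) / len(a | b)

def recommend(user):
    """Suggest items bought by the most similar other customer."""
    others = {name: items for name, items in purchases.items() if name != user}
    nearest = max(others, key=lambda n: similarity(purchases[user], others[n]))
    return others[nearest] - purchases[user]   # items the user does not yet own

print(recommend("alice"))
```

No item is ever labelled "good" or "bad" in advance; the structure (who resembles whom) is discovered from the data alone, which is what makes this flavour of recommendation unsupervised in spirit.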

==================================

03) Reinforcement Learning

==================================

*  Reinforcement learning occurs when one presents the algorithm with examples that lack labels, as in the case of unsupervised learning.

*  However, one can accompany an example with positive or negative feedback according to the solution the algorithm proposes

*  Reinforcement Learning is connected to the applications for which the algorithm must make decisions ( so the product is mostly prescriptive and not just descriptive as in the case of unsupervised learning ) and on top of that the decisions bear some consequences .

*  In the human world, reinforcement learning is mostly a process of learning by trial and error

*  In this type of learning, initial and subsequent errors help the learner, because the learning is associated with a penalty-and-reward system: a penalty accrues whenever factors like cost, loss of time, regret, or pain become associated with the results that come as output for the model to which the reinforcement learning algorithms are applied

*  One of the most interesting examples of reinforcement learning occurs when computers learn to play video games by themselves, scaling up the various levels within the game on their own just by learning the mechanism and the procedure to get through each level.

*  The application lets the algorithm know what sort of action would produce what type of outcome.

*  One can come across a typical example of the implementation of a reinforcement learning program developed by Google's DeepMind, which plays old Atari videogames in solo mode, at https://www.youtube.com/watch?v=VieYniJORnk

*  From the video, one can notice that the program is initially clumsy and unskilled, but it steadily improves with continuous training until it becomes a champion at performing the task
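The trial-and-error, penalty-and-reward idea can be sketched with a tiny epsilon-greedy agent that learns, purely from feedback, which of two actions pays better. The action names and rewards are invented, and the rewards are kept deterministic for simplicity:

```python
import random

random.seed(0)
true_reward = {"left": 0.2, "right": 0.8}   # hidden from the agent
estimates = {"left": 0.0, "right": 0.0}     # the agent's learned value of each action
counts = {"left": 0, "right": 0}

for step in range(500):
    if random.random() < 0.1:                    # occasionally explore at random
        action = random.choice(["left", "right"])
    else:                                        # otherwise exploit the best estimate
        action = max(estimates, key=estimates.get)
    reward = true_reward[action]                 # feedback from the environment
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward
    estimates[action] += (reward - estimates[action]) / counts[action]

print(max(estimates, key=estimates.get))  # the agent settles on "right"
```

Like the Atari player in the video, the agent starts out clumsy (its estimates are all zero) and improves only through the rewards its own actions produce.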

 


 


 


 

Descending the Right Curve in Machine Learning - A relation to science fiction and science in practice

 *  Machine learning may appear as a magic trick to any newcomer to the discipline, something to expect from any application of advanced scientific discovery, much as Arthur C Clarke, the futurist and author of popular science fiction stories like 2001: A Space Odyssey, would suggest. In other words, ML is a construct of so many things combined that the sheer magnitude of the machinery and engineering involved can make it seem incomprehensible, even as it helps a general user to ascertain models and predictions based on the patterns identified in a particular dataset


*  Arthur C Clarke's third law states that "any sufficiently advanced technology is indistinguishable from magic"; to a common user, a sufficiently advanced technology does indeed seem like some form of magic, since in magic the trick is to carry off a spectacle without letting the viewer learn the underlying working principle involved

 

* Though it is widely believed that machine learning's underlying strength is some form of imperceptible mathematical, statistical, and coding-based magic, it is not magic at all; rather, one needs to understand the underlying foundational concepts from scratch so that some of the more complex working mechanisms can be understood. Therefore, it is said that machine learning is the application of mathematical formulations toward a great learning experience

 

*  Assuming that the world itself is a representation of mathematical and statistical formulations, machine learning algorithms strive to learn about such formulations by tracking them back from a limited number of observations.

 

*  Just as humans, with their powers of distinction and perception, can recognise what is a ball and what is a tree, machine learning algorithms can use the computational power of computers and the wide availability of data on all subjects and domains to learn how to solve a large number of important and useful problems


*  Though machine learning is a complex subject, humans devised it, and from its inception machine learning started by mimicking the way in which one learns from the surrounding world. One can likewise express simple data problems and basic learning algorithms through the analogy of how a child perceives and understands the problems of the world, or frame a challenging learning problem through the analogy of descending from the top of a mountain by taking the right slope of descent.

 

*  Now, with a somewhat better understanding of the capabilities of machine learning and how they can help in solving a problem, one can start to learn the more complex facets of the technology in greater detail, with more examples of their proper usage.