Tools used for analyzing algorithms
Writing pseudo code
- NO machine dependent language.
- NO high-level or low-level programming constructs. NO pointers.
- Half open intervals for array ranges (Wrong … Take closed intervals)
- Arrays range from A[1 to N] or A[1..N]. There is no need to use 0-based indexing.
- We access array elements by specifying the array name followed by the index in square brackets. For example, A[i] indicates the ith element of the array A. The notation “...” is used to indicate a range of values within an array. Thus, A[i..j] indicates the subarray of A consisting of the j elements A[i], A[i+1] .. A[j].
- Assumption that size/length are available within the object.
- Single char variable for loops is good.
- Function names should indicate action performed.
- Error handling can be ignored.
Half open intervals during coding
- A half-open range is one which includes the first element, but excludes the last one.
- (a,b] (includes b, excludes a)
- [a,b) (includes a, excludes b)
- The range [1,5) is half-open, and consists of the values 1, 2, 3 and 4.
- [ includes the endpoint and ( excludes the endpoint
- Right half-open intervals are convenient because they have the following desirable properties:
- The interval [a,b) contains b-a elements exactly (rather than b-a+1 or b-a-1).
- The interval of length n starting at i can simply be denoted [i,i+n), rather than the more cumbersome [i,i+n-1].
- The disjoint union of [i,j) and [j,k) is [i,k); and likewise any right half-open interval [a,b) may be divided by intermediate values a < c[0] < c[1] < c[2] ... < c[n] < b into a disjoint union of right half-open intervals [a,c[0]) UNION [c[0],c[1]) UNION [c[1],c[2]) UNION ... UNION[c[n], b); i.e., endpoints telescope nicely with half-open intervals, whereas they do not with fully open or fully closed intervals.
Dijkstra's note on why we should use half open intervals5
Off-by-one errors (Fencepost problem)
An off-by-one error (OBOE), also commonly known as an OBOB (off-by-one bug) or "that extra inch you didn't really want", is a logic error involving the discrete equivalent of a boundary condition.
Loop Invariance
Given a pseudo code or implementation, how would you argue that the code is correct?
How do you show that for any input your code provides the right output?
Loop Invariant helps you state and prove it formally that the desired property is maintained in your loop.
Snippet from Thomas Cormen's answer 2
A loop invariant is a predicate (a statement that is either true or false) with the following properties:
- It is true upon entering the loop the first time.
- If it is true upon starting an iteration of the loop, it remains true upon starting the next iteration.
- The loop terminates, and the loop invariant plus the reason that the loop terminates gives you the property that you want.
Snippet from Stack overflow 3
In simple words, a loop invariant is some predicate (condition) that holds for every iteration of the loop. For example, let's look at a simple for loop that looks like this:
int j = 9; for (int i=0; i
In this example it is true (for every iteration) that i + j == 9. A weaker invariant that is also true is that i >= 0 && i < 10 (because this is the continuation condition!) or that j <= 9 && j >= 0.
Proofs are broken down into 3 steps:
Initialization: Loop Invariant is true before the loop runs.
Maintenance: Loop Invariant is true before an iteration of a loop and it remains true before the next iteration
Termination: Loop Invariant will terminate in a useful (producing desired output) once the loop is finished.
NOTE: Loop Invariance talks about loops. If there are multiple loops, there are multiple Loop Invariants.
Exercise: Red and Blue Marbles in a Jar1
Suppose you have a jar of one or more marbles, each of which is either RED or BLUE in color. You also have an unlimited supply of RED marbles off to the side. You then execute the following "procedure":
while (number of marbles in the jar > 1) {
choose (any) two marbles from the jar;
if (the two marbles are of the same color) {
toss them aside;
place a RED marble into the jar;
}
else { // one marble of each color was chosen
toss the chosen RED marble aside;
place the chosen BLUE marble back into the jar;
}
}
What does the loop do? What is the Loop Invariant here?
At each iteration, we have one less marble within the jar.
If you are stuck and don't know what the loop does and there are multiple ways/combinations for how the loop could branch out, use decision tree for describing the loop and try to infer the loop invariant from there
Exercise: Finding a maximum element in an Array
Suppose we have an array a[0..n-1] (where n>0) of, say, integers, and we want to determine the maximum among the values in a[]. Using maxVal as the output variable, the desired postcondition is
maxVal == Max({a[0], a[1], ... , a[n-1]})
What do we want the algorithm to do? What should be the Loop Invariant here?
Using the approach described above, we replace "constant" n in the postcondition by variable k in order to obtain
P: maxVal == Max(a[0..k-1])
This leads us to propose the following program skeleton:
k = 1;
maxVal = a[0];
// {loop invariant P: maxVal == Max(a[0..k-1])}
while (k != n) {
if (a[k] > maxVal)
{ maxVal = a[k]; }
// {P(k:=k+1): maxVal == Max(a[0..k])}
k = k+1; // re-establishes P
}
// { maxVal == Max(a[0..k-1]) & k == n }
// {postcondition: maxVal == Max(a[0..n-1]) }
Having set k to 1, to establish P(k:=1) requires simply that maxVal be initialized to the maximum of a[0..0], which is obviously a[0] itself.
As for establishing P(k:=k+1) inside the loop before incrementing k, consider that, at this point, we have P (i.e., maxVal == Max(a[0..k-1])) and that we want to change maxVal to establish maxVal == Max(a[0..k]).
How are Max(a[0..k-1]) and Max(a[0..k]) related? Given that a[0..k-1] and a[0..k] are the same, except for the latter containing one extra element, ---namely, a[k]--- clearly the only difference is this: if a[k] is bigger than the maximum element in a[0..k-1], it is the maximum of a[0..k]; otherwise, the maximum of the two array segments is the same.
Analysis
In general, the time taken by an algorithm grows by the size of the input, so it is traditional to describe the running time of the program as a function of the size of its input.
Running time of an algorithm is the number of steps/operations performed. Its supposed to be machine independent.
For example:
Now that we have calculated the running time, its important to get the order of growth.
Small O notation (Lower bound)
Tight bound (Theta notation)
Big O notation (Upper bound/Worst case)
Based on the above example the worst case all the lines of code are executed for maximum number of times. Because its a geometric series of sum of all numbers, we have a quadratic time upper bound.
Average case
Is this same as Tight bound?
Average case refers to what happens if we change the order of elements - on an average what is the running time complexity? The scope of average running times is difficult to answer because it is not apparent what constitutes an "average" input for a given problem. Often we shall assume that all inputs of a given size are equally likely.
Randomized algorithms make random choices - which allows for probabilistic choices and yield an expected running time. ???
Expected value/Expectation/Mean-Value/First-Moment
In probability theory, the expected value of a random variable, intuitively, is the long run average value of repetitions of the experiment it represents. 4
Practically, the expected value of a discrete random variable is the probability weighted average of all possible values. The expected value is a key aspect of how one characterizes a probability distribution; it is one type of location parameter.
Expected value = Possible value of the random variable x
probability of that value occuring
The expected value does not exist for random variables having some distributions with large "tails", such as the Cauchy distribution. For random variables such as these, the long-tails of the distribution prevent the sum/integral from converging.
By contrast, the variance is a measure of dispersion of the possible values of the random variable around the expected value.
Variance = expected value of the squared deviation of the variable's value
from the variable's expected value.
In statistical modeling, regression analysis is a statistical process for estimating the relationships among variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed. In all cases, the estimation target is a function of the independent variables called the regression function.
In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function which can be described by a probability distribution.
Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning.Many techniques for carrying out regression analysis have been developed. Familiar methods such as linear regression and ordinary least squares regression are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data.
Nonparametric regression refers to techniques that allow the regression function to lie in a specified set of functions) , which may be infinite-dimensionalIn linear regression, the model specification is that the dependent variable,
is a linear combination of the parameters (but need not be linear in the independent variables). For example, in simple linear regression for modeling
data points there is one independent variable:
, and two parameters,
and
:
Properties of Expectation
- E.V is constant.
- E.V is monotonic ie if X < Y then E[X] < E[Y]
- E.V expresses linearity
- E[X+c] = E[X] +c
- E[X+Y] = E[X] + E[Y]
- E[aX] = aE[X]
- E.V is non-multiplicative
- Cov(X,Y) = E[XY] - E[X]E[Y]
- if "Covariance" Cov(X,Y) is 0, then they're independent and uncorrelated.
Uses of Expectation
- E.V = P(Event) if E(I(Event has occurred atmost once))
Indicator function
The Indicator function of an event is a random variable which takes:
- value 1 if event happens
- value 0 if event does not happen
Properties of Indicator function
Things to think about
- How can you modify any algorithm to have a good best-case running time?
A: Adding special cases like if input already sorted return array.
1. Loop Invariant ↩
2. Thomas Cormen's Answer ↩
3. Stackoverflow What is loop invariant ↩
4. Expected value probability theory ↩