Posted on 16 December 2021, by Kevin Mora
The following are my notes – based on the book "Algorithms 4th Edition, by Robert Sedgewick and Kevin Wayne."
The structure of writing a good program in computer science is similar to the structure of writing a proof in mathematics... I think, "Computers are like telescopes through which we could better investigate the computational nature of reality."
An algorithm is a method for doing something on a computer; a data structure is a layout for memory that represents some sort of data. You can't really write anything without both, so asking which one is "compulsory" is just silly: they're both vital. Moreover, you can't really split the two apart. Data structures are only interesting because of the algorithms they enable: when we say a hash table has O(1) insertion, what we really mean is that there is an O(1) algorithm to insert something into a hash table. On the flip side, most algorithms are inextricably tied to some data structure.
A data structure is a structure that stores some form of data. The goal of a data structure is to store the data in a form that makes it easy for the programmer to read or update it. An algorithm is a set of steps taken to achieve a particular goal; that goal can be anything, not necessarily related to programming – "algorithm" is just the technical name for such a recipe. A good algorithm is one that uses data structures suited to its needs, so that the sequence of steps finishes in the least amount of time, using the least amount of resources.
This is the scientific approach to designing and analyzing algorithms: we build mathematical models to try to understand what's going on, and then we run experiments to validate those models and help us improve things.
A Monte Carlo Simulation is a way of approximating the value of a function where calculating the actual value is difficult or impossible. It uses random sampling to define constraints on the value and then makes a sort of "best guess." A simple Monte Carlo Simulation can be used to calculate the value for π.
If you had a circle and a square where the length of a side of the square was the same as the diameter of the circle, the ratio of the area of the circle to the area of the square would be π/4. So, if you put this circle inside the square and select many random points inside the square, the number of points inside the circle divided by the number of points inside the square and the circle would be approximately π/4.
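As a minimal sketch of that idea (the class name, trial count, and use of Math.random() are my own choices, not from the book):

// Estimate pi by sampling random points in the unit square and counting
// how many fall inside the quarter circle of radius 1 centered at the origin.
public class MonteCarloPi {
    public static double estimate(int trials) {
        int inside = 0;
        for (int t = 0; t < trials; t++) {
            double x = Math.random();            // random point in [0, 1) x [0, 1)
            double y = Math.random();
            if (x * x + y * y <= 1.0) inside++;  // point lies inside the quarter circle
        }
        return 4.0 * inside / trials;            // inside/trials approximates pi/4
    }

    public static void main(String[] args) {
        System.out.println(estimate(10000000));  // prints something close to 3.14159
    }
}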
Given a set of n objects (their physical location doesn't really matter), we use the numbers 0 through n-1 to model them. Then we have the idea of a connection between two objects: the union command connects two given objects. The key part of the problem is the connected query, which simply asks: is there a path connecting the two objects? So that's our problem: intermix union commands and connected queries, and support both efficiently for a large number of objects.
CONNECTION | DESCRIPTION
---|---
connected(8, 9) | even if we don't have a direct connection between 8 and 9, there is a path from 8 to 3 to 4 to 9
union(5, 0) | direct connection
union(7, 2) | direct connection
union(6, 1) | direct connection
union(0, 1) | redundant connection, but still a connection
connected(0, 7) | since there's a path, they are connected
We assume "is connected to" is an equivalence relation: it is reflexive (p is connected to p), symmetric (if p is connected to q, then q is connected to p), and transitive (if p is connected to q and q is connected to r, then p is connected to r).
Efficient data structure for union-find:
public class UF
UF(int N) // initialize union-find with N objects (0 to N-1)
void union(int p, int q) // add connection between p and q
boolean connected(int p, int q) // are p and q in the same component?
int find(int p) // component identifier for p(0 to N-1)
int count() // number of components
Interpretation: p and q are connected iff ("iff" stands for "if and only if") they have the same id. Each entry in the array contains a number that corresponds to one of the connected components.
Integer array id[] of size N.
public class QuickFindUF {
private int[] id;
public QuickFindUF(int N) {
id = new int[N]; // set id of each object to itself
for(int i = 0; i < N; i++) { // (N array accesses)
id[i] = i;
}
}
public boolean connected(int p, int q) {
return id[p] == id[q]; // check whether p and q are in the
} // same component (2 array accesses)
public void union(int p, int q) {
int pid = id[p];
int qid = id[q];
for(int i = 0; i < id.length; i++) { // check all entries with id[p] and id[q]
if (id[i] == pid) id[i] = qid; // at most 2N + 2 array accesses
}
}
}
// The QuickFindUF algorithm is very fast and efficient at checking
// whether two objects are connected. Connecting two objects (union),
// however, is more expensive: it has to scan the whole array.
In particular, if you have just n union commands on n objects – which is not unreasonable – quick-find takes quadratic (N^2) time. Quadratic time is much too slow, and we therefore can't accept quadratic-time algorithms for large problems.
The reason is that they don't scale: as computers get faster and bigger, quadratic algorithms actually get slower in relative terms. Let's talk roughly about what that means. A rough standard for now is that computers can run billions of operations per second and have billions of entries in main memory, which means you could touch everything in main memory in about a second. That was true when computers had only a few thousand words of memory, and it's true now that they have billions or more.
With huge memories we can address huge problems: we could have billions of objects and hope to do billions of union commands on them. The problem with the quick-find algorithm is that this would take about 10^18 operations – array accesses, touches of memory. Quadratic-time algorithms are too slow, and they don't scale: you might get a new computer that's ten times as fast and holds a problem ten times as big, but with a quadratic algorithm the bigger problem runs ten times slower.
So quick-find is too slow for huge problems. How are we going to do better? Our first attempt is an alternative called "Quick-Union." This algorithm tries to avoid doing work until it has to. It uses the same data structure – an array id[] of size N – but with a different interpretation: each entry is the parent of its element, so each connected component forms a tree. To merge the components containing two items p and q that are in different components, all we do is set the id of p's root to the id of q's root.
--- 0 1 2 3 4 5 6 7 8 9
id[] 0 1 1 8 3 5 5 7 8 8
// Based on the array above, how many elements are in
// the same connected component as 8?
// - Answer: 4.
//
// Nodes 3, 8, and 9 point directly to the root 8, and node 4
// points to 3, which points to 8; therefore 4 is in the component too.
// - Final set is: {3, 4, 8, 9}
// PROBLEM #1
Executing the instruction union(3, 1) will result in the update
of a single array element in the diagram above.
Calling union(3, 1) changes the value on index: 8.
The new value on that index is: 1.
// PROBLEM #2
Executing the instruction union(5, 9) will result in the update
of a single array element in the diagram above.
Calling union(5, 9) changes the value on index: 5 (the root of 5).
The new value on that index is: 8 (the root of 9).
public class QuickUnionUF {
private int[] id;
public QuickUnionUF(int N){
id = new int[N];
for(int i = 0; i < N; i++) { // set id of each object to
id[i] = i; // itself (N array accesses)
}
}
public int root(int i) {
while(i != id[i]) i = id[i]; // chase parent pointers until reach root
return i; // (depth of i array accesses)
}
public boolean connected(int p, int q){ // check if p and q have same root
return root(p) == root(q); // (depth of p and q array accesses)
}
public void union(int p, int q){
int i = root(p); // change root of p to point to root of q
int j = root(q); // (depth of p and q array accesses)
id[i] = j;
}
}
Quick-Union is faster than Quick-Find, but it's also too slow. The defect of Quick-Union is that the trees can get too tall, which means the find operation gets too expensive – essentially, it's going to be too slow if you have a lot of operations.
The weighted Quick-Union algorithm makes sure that the smaller tree gets linked to the root of the larger tree.
The image above represents the quick-union algorithm and the weighted quick-union algorithm (100 sites, 88 union() operations).
We can analyze the running time mathematically. The find operation takes time proportional to how far down a node is in the tree (proportional to the depth of p). Union takes constant time, given the roots. And we can show that the depth of any node in the tree is guaranteed to be at most the base-2 logarithm of N.
When using the weighted Quick-Union algorithm the depth of any node x in the tree is at most lg N ("lg" stands for "base-2 logarithm").
Pf. What causes the depth of object x to increase? The depth of x increases by 1 only when the tree T1 containing x is merged into another tree T2. The weighted rule only does this when |T2| ≥ |T1|, so the size of the tree containing x at least doubles. The size can double at most lg N times, so the depth of x is at most lg N.
ALGORITHM | INITIALIZE | UNION | CONNECTED
---|---|---|---
quick-find | N | N | 1
quick-union | N | N | N
weighted QU | N | lg N | lg N
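As a sketch of the weighted variant described above – an extra sz[] array tracks tree sizes so that the smaller tree is always linked below the root of the larger one (this follows the book's WeightedQuickUnionUF idea, but the code here is my own condensed version):

public class WeightedQuickUnionUF {
    private int[] id; // id[i] = parent of i
    private int[] sz; // sz[i] = number of nodes in the tree rooted at i

    public WeightedQuickUnionUF(int N) {
        id = new int[N];
        sz = new int[N];
        for (int i = 0; i < N; i++) { id[i] = i; sz[i] = 1; }
    }

    public int root(int i) {
        while (i != id[i]) i = id[i]; // chase parent pointers until reaching the root
        return i;
    }

    public boolean connected(int p, int q) {
        return root(p) == root(q);
    }

    public void union(int p, int q) {
        int i = root(p);
        int j = root(q);
        if (i == j) return;
        if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; } // link smaller tree below larger
        else               { id[j] = i; sz[i] += sz[j]; }
    }
}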
public int find(int i) {
while (i != id[i]) {
id[i] = id[id[i]]; // only one extra line of code!
i = id[i];
}
return i;
}
// Literally no reason not to.
// Keeps tree almost completely flat.
Is there an algorithm that is linear, or even simpler? People looked for a long time for one, and it turns out we can prove that no such algorithm exists. There's a lot of theory behind the algorithms that we use, and knowing that theory helps us decide which algorithms to use in practice and where to concentrate our effort in trying to find better ones. It was eventually proved by Fredman and Saks that there is no linear-time algorithm for the union-find problem.
Proposition [Hopcroft-Ullman, Tarjan]: starting from an empty data structure, any sequence of M union-find operations on N objects makes at most c(N + M lg* N) array accesses, where lg* is the iterated logarithm (the number of times you must apply lg before reaching 1).
ALGORITHM | WORST-CASE TIME
---|---
quick-find | M N
quick-union | M N
weighted QU | N + M log N
QU + path compression | N + M log N
weighted QU + path compression | N + M lg* N
Weighted quick-union with path compression comes close enough to linear in practice that it enables the solution of huge problems – problems that could not be addressed otherwise. For example, with a billion (10^9) union operations on a billion (10^9) objects, the slow algorithm might, as said before, take thirty years to complete the computation. With this algorithm we can do it in six seconds.
What's most important to recognize about this is that it's the algorithm design that enables the solution to the problem. A faster computer wouldn't help much: you could spend millions on a supercomputer and maybe get it done in six years instead of 30, or in two months – but with a fast algorithm, you can do it in seconds on your own PC.
Write a program to estimate the value of the percolation threshold via Monte Carlo simulation.
A system percolates if you can find a way to get from the top to the bottom through open (white) sites. This is an example of a mathematical model where the problem is very well articulated – what is that threshold value? – but nobody knows the solution to that mathematical problem. The only solution we have comes from a computational model, where we run simulations to determine the value of that probability, and those simulations are only enabled by fast union-find algorithms. That's our motivating example for why we might need fast union-find. So what we run is a so-called Monte Carlo simulation: initialize the whole grid to be blocked (filled with black sites) and then randomly open sites.
Clever trick:
What we want to do is check whether any site in the bottom row is connected to any site in the top row, and use union-find for that. The problem is that the obvious brute-force approach is quadratic: it makes n^2 calls to the connected query, checking each site on the top against each site on the bottom – far too slow. Instead, we create a virtual site on the top and one on the bottom. Then, to ask whether the system percolates, we just check whether the virtual top site is connected to the virtual bottom site.
So how do we model opening a new site? To open a site we just connect it to all of its adjacent open sites. That's a few calls to union, but it's easy to implement. With that simple relationship we can use exactly the code we developed to run a simulation for this connectivity problem – and by running enough simulations for a big enough n, we get an estimate of the percolation threshold.
So, what is the percolation threshold p*? About 0.592746 for large square lattices – a constant known only via simulation. Fast algorithms enable accurate answers to scientific questions: with a slow union-find algorithm we wouldn't be able to run it for very big problems, and we wouldn't get a very accurate answer.
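Here is a rough sketch of a single simulation run under those assumptions. The grid indexing, the virtual-site bookkeeping, and the class/method names are my own shorthand (it relies on a weighted union-find like the one sketched earlier and on StdRandom from algs4.jar), not the official assignment solution:

import edu.princeton.cs.algs4.StdRandom;

public class PercolationEstimate {
    // run one trial on an n-by-n grid and return the fraction of sites
    // that were open at the moment the system first percolated
    public static double singleTrial(int n) {
        WeightedQuickUnionUF uf = new WeightedQuickUnionUF(n * n + 2);
        int top = n * n, bottom = n * n + 1;          // virtual top and bottom sites
        boolean[] open = new boolean[n * n];
        int opened = 0;
        while (!uf.connected(top, bottom)) {
            int site = StdRandom.uniform(n * n);      // pick a random site
            if (open[site]) continue;                 // already open: try again
            open[site] = true;
            opened++;
            int row = site / n, col = site % n;
            if (row == 0)     uf.union(site, top);    // top row connects to virtual top
            if (row == n - 1) uf.union(site, bottom); // bottom row connects to virtual bottom
            if (row > 0     && open[site - n]) uf.union(site, site - n); // open neighbors
            if (row < n - 1 && open[site + n]) uf.union(site, site + n);
            if (col > 0     && open[site - 1]) uf.union(site, site - 1);
            if (col < n - 1 && open[site + 1]) uf.union(site, site + 1);
        }
        return (double) opened / (n * n);
    }

    public static void main(String[] args) {
        System.out.println(singleTrial(200)); // should print a value near 0.593
    }
}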
The idea of understanding the running time of a computation goes way back – probably even before Babbage.
Babbage said, "... as soon as an analytical engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise by what course of calculation can these results be arrived at by the machine in the shortest time." If you look at Babbage's Analytical Engine, it has a crank on it, and literally the concern Babbage had in knowing how long a computation would take was how many times you have to turn the crank. It's not that different in today's world: the crank may be something electronic that happens a billion times a second, but we're still asking how many times some discrete operation has to be performed in order to get a computation done.
The practical reason we analyze and understand algorithms is to avoid performance bugs. There's a long list of success stories in designing algorithms with better performance, thus enabling the solution of problems that wouldn't have been solved otherwise.
The feature of the natural world that we're studying here is some particular computer that exists in the real world. That changes the algorithm from an abstraction into an actual physical process, like electrons racing around inside the computer.
For many problems, the running time can vary widely depending on the input.
Proposition 0: in the linked-list implementation of Bag, Stack,
and Queue, all operations take constant time in the worst case.
Proposition 1: in the resizing-array implementation of Bag, Stack, and Queue,
starting from an empty data structure, any sequence of N operations takes time
proportional to N in the worst case (amortized constant time per operation).
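The doubling/halving strategy is what makes Proposition 1 work: starting from an empty structure, each item gets copied only a constant number of times on average. A bare-bones sketch of that strategy for a stack of strings (my own stripped-down version, not the book's full class):

public class ResizingArrayStackSketch {
    private String[] s = new String[1]; // underlying array, grows and shrinks
    private int N = 0;                  // number of items on the stack

    public void push(String item) {
        if (N == s.length) resize(2 * s.length); // array full: double its size
        s[N++] = item;
    }

    public String pop() {
        String item = s[--N];
        s[N] = null;                                          // avoid loitering
        if (N > 0 && N == s.length / 4) resize(s.length / 2); // one-quarter full: halve
        return item;
    }

    private void resize(int capacity) {
        String[] copy = new String[capacity];  // copying costs ~N, but happens rarely
        for (int i = 0; i < N; i++) copy[i] = s[i];
        s = copy;
    }
}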
A Turing Complete system means a system in which a program can be written that will find an answer (although with no guarantees regarding runtime or memory).
Alan Turing conceived a machine that can take a program, run that program, and show some result. But at first a different machine had to be described for each different program, so he created the "Universal Turing Machine," which can take any program and run it. Programming languages are similar to those machines (although virtual): they take programs and run them. A programming language is called "Turing complete" if it can run any program (irrespective of the language) that a Turing machine can run, given enough time and memory.
For example: Let's say there is a program that takes 10 numbers and adds them. A Turing machine can easily run this program. But now imagine that for some reason your programming language can't perform the same addition. This would make it "Turing incomplete" (so to speak). On the other hand, if it can run any program that the universal Turing machine can run, then it's Turing complete.
Most modern programming languages (e.g. Java, JavaScript, Perl, etc.) are all Turing complete because they each implement all the features required to run programs like addition, multiplication, if-else conditions, return statements, ways to store/retrieve/erase data and so on... Turing and other early computer scientists would build a specific machine each time they wanted to solve a specific problem. We’re used to one machine that can be forever reprogrammed.
I wrote about Turing Completeness here.
Given N distinct integers, how many triples sum to exactly zero?
// Triples Sum Brute-force Algorithm
import edu.princeton.cs.algs4.In;      // library classes from algs4.jar
import edu.princeton.cs.algs4.StdOut;

public class ThreeSum {
public static int count(int[] a) {
int N = a.length;
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
for (int k = j+1; k < N; k++)
if (a[i] + a[j] + a[k] == 0)
count++;
return count;
}
public static void main(String[] args) {
In in = new In(args[0]);
int[] a = in.readAllInts();
StdOut.println(count(a));
}
}
public class Stopwatch // (part of: stdlib.jar)
Stopwatch() // create a new stopwatch
double elapsedTime() // time since creation (in seconds)
// Implementation:
public static void main(String[] args) {
In in = new In(args[0]);
int[] a = in.readAllInts();
Stopwatch stopwatch = new Stopwatch(); // 1
StdOut.println(ThreeSum.count(a));
double time = stopwatch.elapsedTime(); // 2
StdOut.println("elapsed time " + time); // 3
}
If this were a scientific problem where we were counting something that happens in the natural world – the number of ants in an anthill, say – we'd have only a few data points, and we'd try to understand what's going on by plotting the quantity we're interested in (running time) on the y-axis against problem size on the x-axis. What scientists usually do, because so many problems fall into this class, is make the plot a log-log plot.
If you do it as a log-log plot, very often you'll get a straight line, and the slope of the straight line is the key to what's going on. In this case the slope of the straight line is three, and you can run a regression to fit a straight line through the data points. It's not difficult to do the math and show that if you get a straight line with slope b, then your function is proportional to a N^b.
That's called the power law, and it holds for many, many scientific problems, including the running times of most algorithms. Here's a little bit of the math: a straight line on a log-log plot with powers of two means that lg(T(N)) = b lg N + c. We have empirical values for b and c, and if you raise 2 to the power of each side of that equation you get T(N) = a N^b, where a = 2^c. So right away, just from observation, we have a pretty good model for the running time of our program: it seems the running time is about 10^-10 × N^3 seconds. We can use that hypothesis to make predictions – just plug in different values of N. It says it will take us about 400 seconds for N = 16,000 in our triple-sum brute-force program. 400 seconds is enough time that it's worth investing in running the experiment, and sure enough we come in pretty close, at about 408 seconds. Now we can make a prediction for 32,000 or for whatever else we might be interested in: the model lets us make predictions without the expense of running the experiments. And in the great majority of cases, computer-algorithm running times do follow a power law.
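A handy way to estimate the exponent b without plotting anything is the doubling hypothesis: run the program for N, 2N, 4N, ... and take the base-2 logarithm of the ratio of consecutive running times; that ratio converges to b. A sketch in the spirit of the book's doubling-ratio experiment, assuming the ThreeSum and Stopwatch classes shown above (the array bounds and sizes are arbitrary choices of mine):

import edu.princeton.cs.algs4.StdOut;
import edu.princeton.cs.algs4.StdRandom;
import edu.princeton.cs.algs4.Stopwatch;

public class DoublingRatioSketch {
    // time one call to ThreeSum.count() on N random ints
    public static double timeTrial(int N) {
        int[] a = new int[N];
        for (int i = 0; i < N; i++) a[i] = StdRandom.uniform(-1000000, 1000000);
        Stopwatch timer = new Stopwatch();
        ThreeSum.count(a);
        return timer.elapsedTime();
    }

    public static void main(String[] args) {
        double prev = timeTrial(250);
        for (int N = 500; N <= 16000; N += N) {
            double time = timeTrial(N);
            // lg(time / prev) approaches the exponent b of the power law (about 3 here)
            StdOut.printf("%7d %9.3f %5.1f%n", N, time, Math.log(time / prev) / Math.log(2));
            prev = time;
        }
    }
}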
System-independent effects (these determine the exponent in the power law): the algorithm and the input data. System-dependent effects (these determine the leading constant): the hardware (CPU, memory, cache), the software (compiler, interpreter, garbage collector), and the system (operating system, network, other applications).
Bad news: it is difficult to get precise measurements. Good news: experiments are much easier and cheaper to run than in other sciences.
In the 60s, computer systems were becoming really complicated for the first time, and computer scientists were concerned about whether we would actually be able to understand what's going on. Knuth was very direct in saying that this is something we certainly can do: we can calculate the total running time of a program by identifying all the basic operations, figuring out their cost, figuring out their frequency of execution, and summing up cost times frequency over all the operations. You have to analyze the program to determine the set of operations; the cost depends on the machine and the system, which is what we talked about before; and the frequency leads us to mathematics, because it depends on the algorithm and the input data. From Knuth we know that, in principle, we can get accurate mathematical models for the performance of algorithms or programs.
If you wanted to run experiments in the 60s or so, you would actually look in the computer's manual: every computer came with a manual that said precisely how long each instruction would take. Nowadays it's a little more complicated, but there are still ways to determine the costs of the basic operations.
How many instructions as a function of input size N?
“It is convenient to have a measure of the amount of work involved in a computing process, even though it be a very crude one. We may count up the number of times that various elementary operations are applied in the whole process and then give them various weights. We might, for instance, count the number of additions, subtractions, multiplications, divisions, recording of numbers, and extractions of figures from tables. In the case of computing with matrices most of the work consists of multiplications and writing down numbers, and we shall therefore only attempt to count the number of multiplications and recordings.” — Alan Turing
Turing knew that we want a measure of the amount of work involved in a computing process, and he recognized that you don't necessarily want to do it in full detail – a crude estimate is still helpful. You could count up the number of times every operation is applied, give the operations weights, and sum it all up; but maybe we should just count the ones that are most expensive. That's what Turing said in 1947, and realistically that's what we do today: rather than counting every little detail, we take some basic operation that's the most expensive and/or executed most often – the one whose cost times frequency is the highest – and use it as a proxy for the running time.
Essentially, we make the hypothesis that the running time grows like a constant times the count of that operation; in this case we pick array accesses. That's the first simplification. The second simplification is to ignore low-order terms in the formulas we derive. There's an easy way to do that, called tilde notation: when N is large, in a formula like ⅙ N^3 + 20 N + 16 the N^3 term is much, much larger than the N term or the 16. Therefore we use a cost model and tilde notation to simplify operation counts, and counting some basic operation as a proxy for the running time is the best way to describe the cost model.
In principle, accurate mathematical models are available. In practice, formulas can be complicated, advanced mathematics might be required, and exact models best left for experts. For all the algorithms we'll try to communicate a reasonable approximate model that can be used to describe the running time.
Approximate models in this course: T(N) ~ c N^3
When we analyze algorithms, not too many different functions actually arise, and that allows us to classify algorithms according to their performance as the problem size grows. The good news is that only a few functions turn up for the algorithms we're interested in. We can construct algorithms with other growth functions, and there are counterexamples, but a great number of the algorithms we consider are described by the few functions listed below. And when we talk about the order of growth, we're not talking about the leading constant: normally we just say the running time of the algorithm is proportional to N log N.
The set of functions: 1, log N, N, N log N, N^2 , N^3 , and 2^N suffices to describe the order of growth of most common algorithms.
If: f(N) ~ c g(N) for some constant c > 0, then the order of growth of f(N) is g(N).
// The order of growth of the running time of this code is N^3:
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
for (int k = j+1; k < N; k++)
if (a[i] + a[j] + a[k] == 0)
count++;
Order-of-growth classifications matter because we need linear or linearithmic algorithms to keep pace with Moore's law and to solve huge problems.
"Three orders of magnitude in machine speed and three orders of magnitude in algorithmic speed add up to six orders of magnitude in solving power. A model that might have taken a year to solve 10 years ago can now solve in less than 30 seconds."
public static int binarySearch(int[] a, int key) {
int lo = 0, hi = a.length-1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2;
if (key < a[mid]) hi = mid - 1; // one "3-way compare"
else if (key > a[mid]) lo = mid + 1; // one "3-way compare"
else return mid;
}
return -1;
}
// Invariant: if key appears in the array a[], then a[lo] ≤ key ≤ a[hi].
Pf sketch: assume N is a power of 2.
T(N) ≤ T(N / 2) + 1 [ given ]
≤ T (N / 4) + 1 + 1 [ apply recurrence to first term ]
≤ T (N / 8) + 1 + 1 + 1 [ apply recurrence to first term ]
⋮
≤ T (N / N) + 1 + 1 + … + 1 [ stop applying, T(1) = 1 ]
= 1 + lg N
The sorting-based N^2 log N algorithm for 3-SUM is significantly faster in practice than the brute-force N^3 algorithm.
Guiding principle: typically, better order of growth ⇒ faster in practice.
Computer scientists think of a million as 2^20 and a billion as 2^30, because those are the numbers of distinct values you can represent with 20 and 30 bits, and everything stays consistent with their calculations. In recent years we've mostly switched to a model where machines are 64-bit and pointers are eight bytes. That allows us to address much more memory, but pointers use much more space, and this transition actually caused a lot of problems initially because programs were using way more space than people thought they should. We will probably not have to go through this kind of transition again, because 64 bits is enough to address anything we might need to address: 2^64 is a really huge number.
64-bit machine: we assume a 64-bit machine with 8-byte pointers.
A Java Virtual Machine (JVM) is free to store data any way it pleases internally, big or little endian, with any amount of padding or overhead, though primitives must behave as if they had the official sizes. Example: the JVM or native compiler might decide to store a boolean[] in 64-bit long chunks like a BitSet. It does not have to tell you, so long as the program gives the same answers.
Then of course the hardware and OS have multiple layers of caching: on-chip cache, SRAM cache, DRAM, the ordinary RAM working set, and backing store on disk. Your data may be duplicated at every cache level. All this complexity means you can only very roughly predict RAM consumption.
Integer: 24 bytes (16 bytes of object overhead + 4 bytes for the int value + 4 bytes of padding).
Long: 24 bytes (16 bytes of object overhead + 8 bytes for the long value).
String: roughly 2N + 64 bytes for a string of length N in the classic layout the book assumes (the String object itself plus a reference to, and the contents of, its underlying char[] array).
Multidimensional Arrays, e.g.:
An int[128][2] instance takes 3,600 bytes. Compared to the 1,040 bytes
an int[256] instance uses (which has the same capacity), those 3,600 bytes
represent a 246 percent overhead. In the extreme case of byte[256][1], the
overhead factor is almost 19! Compare that to the C/C++ situation, in which
the same syntax does not add any storage overhead. (These figures assume a
32-bit JVM layout: a 16-byte array header plus 4-byte elements and references.)
Some garbage-collection schemes may impose a minimum object size which is separate from padding. During garbage collection, once an object gets copied from an old location to a new one, the old location may not need to hold the data for the object any more, but it will need to hold a reference to the new location; it may also need to store a reference to the old location of the object wherein the first reference was discovered and the offset of that reference within the old object (since the old object may still contain references that haven't yet been processed).
Typical Memory Usage for Objects in Java.
// A Date object uses 32 bytes of memory.
public class Date { // 16 bytes (object overhead)
private int day; // 4 bytes (int)
private int month; // 4 bytes (int)
private int year; // 4 bytes (int)
... // 4 bytes (padding)
} // --------------------------
// 32 bytes
Total memory usage for a data type value:
Shallow memory usage: don't count referenced objects. Deep memory usage: if array entry or instance variable is a reference, count memory (recursively) for referenced object.
Depending on context, we may or may not count the memory references by an object (recursively). For example, we count the memory for the char[] array in the memory for a String object because this memory is allocated when the string is created. But, we would not ordinarily count the memory for the String objects in a StackOfStrings object because the String objects are created by the client.
Recommended Lecture: Analysis of Algorithms.
In a nutshell, Big O is used to classify algorithms based on their scalability and to estimate the worst-case running time as a function of the input size. It allows us to understand how the resource requirements of a program or algorithm scale. Remember: complexity affects performance, but not the other way around.
private static int sum (int[] numbers){
int sum = 0;
for(int i = 0; i < numbers.length; i++){
sum = sum + numbers[i];
}
return sum;
}
We need to predict which method takes more work and thus more time for sufficiently large input sizes. For small input sizes, we can’t predict whether one method outperforms another solely based on the fact that one requires constant time and the other linear time.
Expression | Tilde Notation |
---|---|
14 N^3 - 4 N^2 + 7 | ~ 14N^3 |
N^2 + 345 | ~ N^2 |
0.5 N^3 | ~ 0.5 N^3 |
0.5 N^3 + 999999 N^2 + 3000N | ~ 0.5 N^3 |
123 N^2 + 42N -13 | ~ 123 N^2 |
40 N^2 | ~ 40 N^2 |
Assume you have a method that requires 19 steps to execute.
What is the corresponding Big-O?
// O(1)
Assume you have a method that requires 13n + 7 steps to execute.
What is the corresponding Big-O?
// O(N)
O(1) –> constant –> no N
O(N) –> there's one N –> Linear Time
O(N^2) –> Quadratic Time
O(log N) –> Logarithmic
O(N log N) –> Linearithmic
// A single binary search is log N; one binary search per each of N keys is N log N
// Doubling or halving the problem at each step gives a log factor
// Linear search is O(N)
Problem A is linear (N + N/2 + N/4 + ...).
int sum = 0;
for (int n = N; n > 0; n /= 2)
for (int i = 0; i < n; i++)
sum++;
Problem B is linear (1 + 2 + 4 + 8 + ...).
int sum = 0;
for (int i = 1; i < N; i *= 2)
for(int j = 0; j < i; j++)
sum++;
Problem C is linearithmic (the outer loop loops lg N times).
int sum = 0;
for (int i = 1; i < N; i *= 2)
for (int j = 0; j < N; j++)
sum++;
int x = 0;
while (x < n) {
for(int i = 0; i < n; i++) {
a[i]++;
}
x++;
}
// O(N^2)
for(int i = 1; i < n; i *= 2) {
StdOut.print(i + " ");
}
StdOut.println();
// O(log N)
for(int i = n; i > 0; i /= 2) {
StdOut.print("* ");
}
StdOut.println();
// O(log N)
int x = n * n;
// O(1)
int count = 0;
for(int i = n; i > 0; i--) {
for(int j = 1; j < n; j *= 2) {
count++;
}
}
// O(N log N)
// a and b are two integer arrays of size n; a is sorted (required for binary search)
for (int i = 0; i < n; i++) {
int index = BinarySearch.indexOf(a, b[i]);
if (index >= 0) {
System.out.println(b[i] + " was found on index " + index);
}
}
// O(N log N)
for(int i = 1; i < n; i *= 2) {
for(int j = 3; j < 8; j++) {
a[i] += StdRandom.uniform(j);
}
}
// O(log N)
int x = 0;
while (x < n) {
for(int i = 0; i < x; i++) {
System.out.print("|.");
}
System.out.println();
x++;
}
// O(N^2)
for(int i = n-1; i > 0; i /= 2) {
if (i % 2 == 0)
a[i]++;
else
a[i]--;
}
// O(log N)
if (n < 10)
System.out.println("0" + n);
else if (n < 100)
System.out.println(n);
// O(1)
public static void printAll(int[] a) {
int N = a.length;
for (int i = 0; i < N; i++) {
for (int j = i+1; j < N; j++) {
for (int k = j; k < 100; k++) {
if (a[i] + a[j] + a[k] == 0) {
StdOut.println(a[i] + " " + a[j] + " " + a[k]);
}
}
}
}
}
// O(N^2)
public int compareTo(Point2D that) {
if (this.y < that.y) return -1;
if (this.y > that.y) return +1;
if (this.x < that.x) return -1;
if (this.x > that.x) return +1;
return 0;
}
// O(1)
private void sink(int k) {
while (2*k <= N) {
int j = 2*k;
if (j < N && greater(j, j+1)) j++;
if (!greater(k, j)) break;
exch(k, j);
k = j;
}
}
// O(log N)
public static void printAll(int[] a) {
int N = a.length;
Arrays.sort(a);
for (int i = 0; i < N; i++) {
for (int j = i+1; j < N; j++) {
int k = Arrays.binarySearch(a, -(a[i] + a[j]));
if (k > j) StdOut.println(a[i] + " "+ a[j] + " "+ a[k]);
}
}
}
// O(N^2 log N)
public String toString() {
String s = "";
for (int i = 0; i < N; i++)
s = s + data[i] + " ";
return s;
}
// O(N^2) – the loop runs N times, but each string concatenation copies the
// whole string built so far, so the total work is ~ N^2/2 character copies
A List is a data structure represented as an ordered collection of values (a.k.a. items, entries, elements), where a value may occur more than once. Lists can add/remove/find elements, as well as many other operations like concatenation.
Inserting an element at a specific location of an array is a fairly expensive operation, and array sizes are fixed when you create them. Allocating too much space and using only a little of it wastes memory.
Linked List is a sequence of nodes such that each node contains a reference or a link to the next node in the list. Every element in the list (known as a node) has two parts – one contains data/info (e.g. String, int, Object), and the other part contains the reference (known as next, contains some memory address pointing to the next node in the list).
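A minimal sketch of that node structure with an insert-at-the-head operation (the class and method names are generic placeholders of my own):

public class LinkedListOfStrings {
    private Node first; // reference to the head of the list (null when empty)

    private class Node {
        String item; // the data part
        Node next;   // the reference part: link to the next node, null at the tail
    }

    // inserting at the head takes constant time: no shifting, no resizing
    public void addFirst(String item) {
        Node oldFirst = first;
        first = new Node();
        first.item = item;
        first.next = oldFirst;
    }
}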
Determine the Big-O for ArrayList operations:

Operation | Cost as O( )
---|---
read (anywhere in the array) | O(1)
add/remove (at the logical end of the array) | O(1)
add/remove (in the interior of the array) | O(N)
resize | O(N)
find by position | O(1)
find by target (element) | O(N)
It can feel counterintuitive that adding/removing in
the interior of the array has a linear order of growth.
That's because the remaining elements need to be shifted.
//
Resizing an array requires creating a new array and copying
the elements over. That has a linear order of growth.
//
Finding a target also takes linear order of growth. Since
there is no indication that the array is sorted, we need
to look at each element until we find it or determine
that it is not there.
Determine the Big-O for LinkedList operations:

Operation | Singly Linked List | Doubly Linked List
---|---|---
read (anywhere in the linked list) | O(N) | O(N)
add/remove (at the head) | O(1) | O(1)
add/remove (at the tail) | O(N) | O(1)
add/remove (in the interior of the structure) | O(N) | O(N)
resize | O(N) | O(N)
find by position | O(N) | O(N)
find by target (element) | O(N) | O(N)
Reading anywhere in the linked list takes a linear order of growth
because we need to get to the element that should be read by navigating
from the first element to the next and the next etc. until we are there.
In the case of a doubly-linked list, we could also start at the end,
but that still leaves us with n/2 steps in the worst case.
The time/effort required to insert an element at
the beginning of an array list is best described with:
// O(n)
The time/effort required to insert an element at
the end of a linked list is best described with:
// O(n)
Assume you need to keep track of 100,000 elements
(the order matters). New elements are rarely added,
but you constantly need to read and overwrite
existing elements. What should you use?
// ArrayList
// Assume you have a stack as described below:
------
12
------
25
------
123
------
5
------
17
------
// What happens when the following statement is executed: push(17);
// A second 17 is pushed onto the top of the stack (duplicates are allowed).
// Assume you have a large amount of data that needs to be added / removed from a stack.
// The job should be completed as quickly as possible.
// Which implementation should you use? The stack that is implemented using
// linked nodes or the stack that is implemented using an array?
// Answer: The stack that uses an array is faster.
// Node is an inner class of LinkedStackOfStrings.
// It is defined like this:
private class Node {
String item;
Node next;
}
// How many bytes of memory does a stack node use
// (excluding the string itself, which is owned by the client)?
// Answer: 40 bytes = 16 (object overhead) + 8 (extra overhead for the inner class)
//         + 8 (reference to the String) + 8 (reference to the next Node).
// Assume you have a queue as represented below:
------------------------------------------------------------------------
null null the best of times null null null
------------------------------------------------------------------------
0 1 2 3 4 5 6 7 8
------------------------------------------------------------------------
// What happens if the method dequeue() is called?
// The word "the" gets removed
// The array element on index 2 is set to null
// The field head gets updated
Suppose that you implement a queue using a null-terminated singly-linked list, maintaining a reference to the item least recently added (the front of the list) but not maintaining a reference to the item most recently added (the end of the list). What are the worst case running times for enqueue and dequeue? Linear time for enqueue and constant time for dequeue.
Assume a client needs to use a stack in a situation where immediate response is of critical importance (e.g. when implementing the landing of an auto-pilot). Which implementation should be used? A linked-list implementation.
// Consider the following code that defines a ResizingArrayStackOfStrings:
public class ResizingArrayStackOfStrings {
private String[] s;
private int N;
...
}
// How much memory does the stack use in the worst case as a function of N?
// ~ 32 N bytes
// If the array is completely full, how many bytes does the stack use to store N strings?
// ~ 8 N bytes
// Iterable has a method that returns an Iterator.
public interface Iterable<Item> {
Iterator<Item> iterator();
}
// An Iterator has methods: hasNext() and next().
public interface Iterator<Item> {
boolean hasNext();
Item next();
void remove(); // remove() is an optional method
}
Stack Iterator - Linked list implementation
import java.util.Iterator;
public class Stack<Item> implements Iterable<Item> {
...
public Iterator<Item> iterator() {
return new ListIterator();
}
private class ListIterator implements Iterator<Item> {
private Node current = first;
public boolean hasNext() {
return current != null;
}
public void remove() {
/* not supported */
}
public Item next() {
Item item = current.item;
current = current.next;
return item;
}
}
}
Stack Iterator - Array Implementation
public class Stack<Item> implements Iterable<Item> {
…
public Iterator<Item> iterator() {
return new ReverseArrayIterator();
}
private class ReverseArrayIterator implements Iterator<Item> {
private int i = N;
public boolean hasNext() {
return i > 0;
}
public void remove() {
/* not supported */
}
public Item next() {
return s[--i];
}
}
}
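Either implementation lets a client use Java's for-each loop, which is just shorthand for calling iterator(), hasNext() and next(). A small client fragment, assuming the Stack class above and StdOut from algs4.jar:

Stack<String> stack = new Stack<String>();
stack.push("to"); stack.push("be"); stack.push("or");
for (String s : stack)     // compiles to calls on the Iterator returned by iterator()
    StdOut.println(s);     // prints "or", "be", "to" – most recently pushed first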
Recursion allows us to write clear and concise code.
There are four common pitfalls that need to be avoided: a missing base case, no guarantee of convergence, excessive memory requirements, and excessive recomputation.
Goal: evaluate infix expressions.
( 1 + ( ( 2 + 3 ) * ( 4 * 5 ) ) )
Dijkstra's two-stack algorithm is used to evaluate (fully parenthesized) arithmetic expressions. To be more precise, it computes the same value that it would if each operator occurred after its two operand values (i.e., in postfix notation)...
Applications of Dijkstra's two-stack algorithm:
public class Evaluate {
public static void main(String[] args) {
Stack<String> ops = new Stack<String>();
Stack<Double> vals = new Stack<Double>();
while (!StdIn.isEmpty()) {
String s = StdIn.readString();
if (s.equals("(")) ;
else if (s.equals("+")) ops.push(s);
else if (s.equals("*")) ops.push(s);
else if (s.equals(")")) {
String op = ops.pop();
if (op.equals("+")) vals.push(vals.pop() + vals.pop());
else if (op.equals("*")) vals.push(vals.pop() * vals.pop());
}
else vals.push(Double.parseDouble(s));
}
StdOut.println(vals.pop());
}
}
Correctness: whenever the algorithm encounters an operator surrounded by two values within parentheses, it leaves the result of applying that operator on the value stack...
( 1 + ( ( 2 + 3 ) * ( 4 * 5 ) ) )
... as if the original input were:
( 1 + ( 5 * ( 4 * 5 ) ) )
Repeating the argument:
( 1 + ( 5 * 20 ) )
( 1 + 100 )
101
// Extensions:
// more ops, precedence order, associativity.
Observation:
All of the parentheses are redundant!
1 2 3 + 4 5 * * +
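Because the parentheses are redundant, postfix can be evaluated with a single value stack: push values, and when an operator appears, pop two values, apply it, and push the result. A sketch under that assumption (not Dijkstra's original two-stack code; it handles only + and *, like the example above):

import edu.princeton.cs.algs4.StdIn;
import edu.princeton.cs.algs4.StdOut;
import edu.princeton.cs.algs4.Stack;

public class EvaluatePostfix {
    public static void main(String[] args) {
        Stack<Double> vals = new Stack<Double>();
        while (!StdIn.isEmpty()) {
            String s = StdIn.readString();
            if      (s.equals("+")) vals.push(vals.pop() + vals.pop()); // pop two, push sum
            else if (s.equals("*")) vals.push(vals.pop() * vals.pop()); // pop two, push product
            else                    vals.push(Double.parseDouble(s));   // operand: push value
        }
        StdOut.println(vals.pop()); // "1 2 3 + 4 5 * * +" prints 101.0
    }
}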
When your class implements Comparable, the compareTo method of the class defines the "natural" ordering of that object. That method is strongly recommended (though not strictly required) to be consistent with the other methods on that object; for example, compareTo should return 0 exactly for objects for which .equals() returns true. A Comparator is its own definition of how to compare two objects, and can be used to compare objects in a way that might not align with the natural ordering. For example, Strings are generally compared alphabetically, so "a".compareTo("b") uses alphabetical comparison. If you wanted to compare Strings by length, you would need to write a custom comparator.
In short, there isn't much difference. They are both ends to similar means. In general implement comparable for natural order, (natural order definition is obviously open to interpretation), and write a comparator for other sorting or comparison needs.
Specifies the natural order. Method: compareTo. How many parameters does compareTo have? One. Which values does compareTo return? Positive, zero, or negative if this object is greater than, equal to, or less than the argument passed.
Specifies an alternative order (not the natural order). Method: compare. How many parameters does compare have? Two. Which values does compare return? Positive, zero, or negative if the first argument is greater, equal, or less than the second argument.
- Class String provides a comparator.
How is it different from the natural order?
It ignores the case (upper-case / lower-case).
- Class Collections provides a comparator.
How is it different from the natural order?
It orders in reverse order (back to front based on the natural order).
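For example (String.CASE_INSENSITIVE_ORDER and Collections.reverseOrder() are the standard library comparators mentioned above; the length comparator is a custom one written here for illustration):

import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;

public class ComparatorDemo {
    public static void main(String[] args) {
        String[] a = { "banana", "Fig", "apple" };

        Arrays.sort(a);                                // natural order: Fig, apple, banana
        Arrays.sort(a, String.CASE_INSENSITIVE_ORDER); // ignores case: apple, banana, Fig
        Arrays.sort(a, Collections.reverseOrder());    // reverse of the natural order

        // alternative order: compare strings by length instead of alphabetically
        Arrays.sort(a, new Comparator<String>() {
            public int compare(String s1, String s2) {
                return Integer.compare(s1.length(), s2.length());
            }
        });
        System.out.println(Arrays.toString(a));        // [Fig, apple, banana]
    }
}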
The selection sort algorithm works as follows:
1: Find the minimum value in the list
2: Swap it with the value in the first position
3: Repeat the steps above for the remainder of the list
(starting at the second position and advancing each time)
Selection sort does ~N^2/2 compares every time, no matter what the input looks like, and it uses exactly one swap per pass (N exchanges in total). That makes it a good choice if we need to minimize swaps, i.e. when moving data is the expensive part.
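A sketch of those steps, simplified to an int[] (the book's Selection.sort works on Comparable[]; this condensed version is mine):

public class SelectionSortSketch {
    public static void sort(int[] a) {
        int n = a.length;
        for (int i = 0; i < n; i++) {
            int min = i;                                 // index of the smallest remaining entry
            for (int j = i + 1; j < n; j++)              // ~N^2/2 compares, no matter the input
                if (a[j] < a[min]) min = j;
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp; // exactly one swap per pass: N swaps total
        }
    }
}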
To sort a randomly ordered array with no duplicate keys,
insertion sort uses ~N^2/4 compares and ~N^2/4 exchanges on average.
Mostly sorted array ≈ O(N)
Randomly ordered array = O(N^2)
Insertion Sort asks the question:
Is my neighbor > than me?
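A sketch of insertion sort in the same simplified style – each new element is swapped leftward past larger neighbors until it is in place, which is why a nearly sorted array takes close to linear time:

public class InsertionSortSketch {
    public static void sort(int[] a) {
        int n = a.length;
        for (int i = 1; i < n; i++) {
            // swap a[i] leftward while its left neighbor is larger ("is my neighbor > me?")
            for (int j = i; j > 0 && a[j] < a[j - 1]; j--) {
                int tmp = a[j]; a[j] = a[j - 1]; a[j - 1] = tmp;
            }
        }
    }
}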
Mergesort runtime: N lg N. Mergesort uses extra space proportional to N. A sorting algorithm is in-place if it uses no more than c log N extra memory.
Merge sort is fast, but it requires more memory. Bottom-up mergesort requires no recursion.
It is important to not create the auxiliary array in the recursive method because that could lead to extensive cost of extra array creation and poor performance. In order to be able to use merge sort all the elements in the array do not need to be unique (merge sort can thus have duplicates).
In general it is a good idea for Java programmers to use assertions.
// PROBLEM #1
For tiny arrays mergesort is too complicated.
There is too much overhead with the recursive calls.
One way to make mergesort even more efficient is to
cut off at a certain array size and use insertion sort.
Which cutoff value can increase the
performance of mergesort by about 20%?
// A: 7.
// PROBLEM #2
Assume you are merging the elements from the
auxiliary array back to the original array.
What happens if the elements on index i and j are identical?
// A: The element on index i gets copied and i gets incremented.
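A sketch of a merge step that illustrates the answer above: on ties it takes the element from the left subarray (index i), which is exactly what keeps mergesort stable. This is my own simplified int[] version; the book's merge works on Comparable[] and carries assertions:

public class MergeSketch {
    // merge the sorted halves a[lo..mid] and a[mid+1..hi], using aux[] as scratch space
    static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k]; // copy everything to the auxiliary array

        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         a[k] = aux[j++]; // left half exhausted
            else if (j > hi)          a[k] = aux[i++]; // right half exhausted
            else if (aux[j] < aux[i]) a[k] = aux[j++]; // right element strictly smaller
            else                      a[k] = aux[i++]; // tie: take from the left (stability)
        }
    }
}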
Merge sort is a prototype for the algorithm-design paradigm called "divide and conquer."
Comparator
The lower bound for compare-based sorting is ~N lg N compares in the worst case (and at least N operations, since every item must be examined). Merge sort is optimal with respect to the number of compares, but it is not optimal with respect to space usage. If additional information is known about the input elements, the N lg N lower bound might not hold. A sorting algorithm is stable if equal items are never moved past each other.
Stable algorithms: insertion sort and mergesort are stable; selection sort, shellsort, quicksort, and heapsort are not.
Which of the following should you do if you want to ensure that your implementation of the merge method results in a stable mergesort? If two keys are equal, take the element from the left subarray.
In the context of computational complexity, an upper bound is a cost guarantee provided by some algorithm for solving problem X – an upper bound on how difficult it is to solve X. What does it mean if we say that sorting has an upper bound of ~N lg N? It means that there is at least one algorithm that can sort in time proportional to N lg N.
// PROBLEM #1
Assume you are eager to discover the next big break-through
in algorithms. A team that works on the development of a sorting
algorithm that guarantees ⅓ N lg N compares offers you the
opportunity to join them as a research assistant.
Should you accept?
// No: the sorting lower bound says that any compare-based algorithm needs
// at least ~N lg N compares in the worst case, so a guarantee of ⅓ N lg N
// compares is impossible.
Mergesort requires fewer compares than quicksort. In practice, however, quicksort is significantly faster than mergesort because it moves less data.
Quicksort partitioning demo. Repeat until the i and j pointers cross:
- Scan i from left to right so long as a[i] < a[lo].
- Scan j from right to left so long as a[j] > a[lo].
- Exchange a[i] with a[j].
When the pointers cross:
- Exchange a[lo] with a[j].
private static int partition(Comparable[] a, int lo, int hi) {
int i = lo, j = hi+1;
while (true) {
while (less(a[++i], a[lo])) // find item on left to swap
if (i == hi) break;
while (less(a[lo], a[--j])) // find item on right to swap
if (j == lo) break;
if (i >= j) break; // check if pointers cross
exch(a, i, j); // swap
}
exch(a, lo, j); // swap with partitioning item
return j; // return index of item now known to be in place
}
Partitioning in-place: using an extra array makes partitioning easier (and stable), but it is not worth the cost.
Terminating the loop: testing whether the pointers cross is trickier than it might seem.
Equal keys: when duplicates are present, it is (counter-intuitively) better to stop the scans on keys equal to the partitioning item's key.
Preserving randomness: shuffling is needed for the performance guarantee.
Equivalent alternative: pick a random partitioning item in each subarray.

- Quicksort: gets the sort done in-place; needs the input to be shuffled before sorting.
- Mergesort: is stable; requires extra space for sorting.
Assume you partition an array that includes duplicate keys.
When i or j reaches a key that is equal to the partitioning key, we should stop.
Assume you have an ordered array of unique elements (no duplicates).
You attempt to quicksort this array without shuffling.
Where will the first partitioning element (the first pivot) end up? At the front.
Quicksort is faster than Mergesort. But how much faster?
Let's assume sorting a very large array with mergesort takes 36 minutes.
How long would it approximately take to sort the same array with quicksort? 24 minutes.
Assume you have an array of distinct elements and you just partitioned it exactly once.
Where is the smallest item in the array? Consider what
happens if the pivot is the smallest element...
It is to the left of the partitioning element (pivot), or it is the pivot itself in the left-most position.
// Assume you have an implementation of quicksort that
// does the random shuffling before the actual sort starts.
// Quicksort can still go quadratic.
// Using randomized quicksort with 3-way
// partitioning can reduce the runtime
// from linearithmic to linear for a broad
// class of applications.
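A sketch of Dijkstra's 3-way partitioning, which is what makes quicksort linear for inputs with huge numbers of equal keys: entries equal to the partitioning value end up in the middle and are never touched again (simplified to int[]; in practice you would shuffle the array first, as noted above):

public class Quick3waySketch {
    public static void sort(int[] a) { sort(a, 0, a.length - 1); }

    private static void sort(int[] a, int lo, int hi) {
        if (hi <= lo) return;
        int lt = lo, gt = hi, i = lo + 1;
        int v = a[lo];                             // partitioning value
        while (i <= gt) {
            if      (a[i] < v) exch(a, lt++, i++); // smaller: grow the left region
            else if (a[i] > v) exch(a, i, gt--);   // larger: grow the right region
            else               i++;                // equal to v: leave it in the middle
        }
        // invariant: a[lo..lt-1] < v, a[lt..gt] == v, a[gt+1..hi] > v
        sort(a, lo, lt - 1);
        sort(a, gt + 1, hi);
    }

    private static void exch(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}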
// Assume you have tens of millions of sensor measurements that have already been sorted.
// Every day about 1000 new measurements are added.
// Before the update can be considered complete the list has to be sorted again.
// You want to find the fastest algorithm that uses the least amount of memory.
// What sorting algorithm should you use? Insertion sort.
// Assume you need a sorting algorithm for a huge amount of data.
// Because of the size of your data set you want to use the
// fastest algorithm that uses the least amount of memory.
// Which sorting algorithm should you choose? Quicksort
To remove the maximum from a heap, exchange the root with the last node, remove the last node, and then use the sink method to restore heap order. When removing an element it is important to avoid loitering (null out the vacated array entry). There is no need to create defensive copies of immutable instance fields when implementing a heap; instead, we rely on immutability to make sure the client cannot change a key once it has been added to the heap.
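A sketch of delMax for a max-heap stored in pq[1..N], showing both steps just described; it assumes the surrounding MaxPQ-style class (the pq[] and N fields, a max-oriented sink(), and an exch() helper), so it is a fragment rather than a complete class:

// delete and return the largest key of a max-heap stored in pq[1..N]
public Key delMax() {
    Key max = pq[1];   // the root holds the maximum
    exch(1, N--);      // exchange the root with the last node and shrink the heap
    sink(1);           // sink the new root to restore heap order
    pq[N + 1] = null;  // null out the vacated entry to avoid loitering
    return max;
}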
There are two elementary ways to implement priority queues – ordered and unordered arrays. However, they have significant drawbacks: an unordered array gives constant-time insert but linear-time delete-the-maximum, while an ordered array gives constant-time delete-the-maximum but linear-time insert. A binary heap gives logarithmic time for both.
Applications that use priority queues: event-driven simulation, data compression (Huffman codes), graph searching (Dijkstra's and Prim's algorithms), artificial intelligence (A* search), operating systems (task scheduling, load balancing), and statistics (maintaining the largest M values in a stream).
Java immutable data types: String, Integer, Vector, Color...
Java mutable data types: StringBuilder, Randomized Queue, Queue, Stack, Priority Queue...
// PROBLEM #1
A symbol table can be implemented by using an ordered array.
Assume you have such a symbol table and N elements have already been inserted.
What is the average cost of searching and inserting?
Answer: log N and N / 2
// PROBLEM #2
A symbol table can be implemented by using an unordered linked list.
Assume you have such a symbol table and N elements have already been inserted.
What is the average cost of searching and inserting?
Answer: N / 2 and N
One of the following is not a BST. Which one is it?
The answer is D: its keys are not in symmetric order (in a BST, every key in a node's left subtree must be smaller than the node's key, and every key in its right subtree must be larger).
In which order are the elements of this BST returned when
using the iterator to access all the elements?
1 3 4 6 7 8 10 13 14
1. Search for the node t that contains the key k.
2. If t has 2 children, find the successor x of t.
3. Delete x from t's right subtree.
4. Put x in the position where t used to be.
When alternating inserting and deleting elements using Hibbard Deletion, over time the BST becomes less balanced. Deleting an element from a BST using Hibbard deletion takes time proportional to sqrt(N).
PROBLEMS/ANALYSIS
During the insertion of elements (keys) we sometimes end up with
red links that are leaning in the wrong direction.
What should be done?
––> Perform one or more rotations. This will re-align the tree.
Assume you have a left-leaning red-black BST that includes N elements (keys).
All rotations in a left-leaning red-black tree are local operations.
What does 'local' mean in this context?
––> Rotations take a constant amount of operations that is independent of N.
While performing certain operations in left-leaning red-black BSTs you
sometimes encounter a node that has 2 red links coming out.
––> That corresponds to a (temporary) 4-node of a 2-3 search tree.
What causes the height of a 2-3 tree to increase?
––> Splitting a temporary 4-node at the root.
Assume you have a 2-3 tree, that contains N elements (N keys).
Which big-O best describes the splitting of a temporary 4-node?
––> O(1).
A red-black tree is a kind of self-balancing binary search tree. Each node stores an extra bit representing "color" ("red" or "black"), used to ensure that the tree remains balanced during insertions and deletions.
Ordered array: fast range count and range search, but O(N) on insert. Binary search tree: O(log N) on insert and range count, O(R + log N) on range search (R = number of matching keys). Unordered list: fast insert, but O(N) on range count and range search.
// Which of the following methods implements
// 1D range count? Answer: C.
// A)
public int rangeCount (Key lo, Key hi) {
if (contains (hi)) {
return hi - lo + 1;
} else {
return hi - lo;
}
}
// B)
public int rangecount (Key lo, Key hi) {
if (contains(hi)) {
return rank(hi) - rank(lo);
} else {
return rank(hi) - rank(lo) - 1;
}
}
// C)
public int rangeCount (Key lo, Key hi) {
if (contains(hi)) {
return rank(hi) - rank(lo) + 1;
} else {
return rank(hi) - rank(lo);
}
}
// D)
public int rangeCount (Key lo, Key hi) {
return min(hi) - max(lo);
}
2D Trees: recursively partition the plane with alternating vertical and horizontal splits (one point and one splitting line per node), which enables efficient range search and nearest-neighbor search.
KD Trees: the same idea generalized to k dimensions, cycling through the k coordinates level by level.
Assume we have N points in a given space and
we want to divide the space up into a M x M grid.
// What would be a good rule of thumb when choosing
// the right size of M for the M x M grid?
––> M should be sqrt(N).
// Given N horizontal and vertical line segments, find all intersections.
// Match the order of growth of the running time to the algorithm.
Quadratic ––> Check all pairs of line segments for intersection.
Linearithmic ––> sweep-line algorithm.
A hash table is a data structure that allows very fast retrieval of data, no matter how much data there is. For that reason hash tables are widely used in database indexing, compilers, authentication, caching, compilation error checking, and much more. The insertion, deletion and retrieval of data in a hash table take constant time on average (under the uniform hashing assumption); the worst case is linear.
Consider using a simple one-dimensional array to find an item in a list: you can employ a brute-force approach such as a linear search, which involves checking each item in turn – for a very big array this could take a very long time. Now suppose you want to retrieve a value from the array and you happen to know the index number of that element: you can look the value up very quickly indeed, and the time it takes is independent of the size of the array and of the element's position in it. But how can you know which element of the array contains the value? The answer is that the index number can be calculated from the value itself, so that the index number is in some way related to the data.
Solving a collision by placing an item somewhere other than its calculated address is called open addressing, because every location is open to any item. Open addressing can use a variety of techniques to decide where to place an item. Linear probing: if the calculated address is occupied, a linear search is used to find the next available slot. If linear probing gets to the end of the array and still can't find a free space, it cycles around to the beginning of the array and continues searching from there. To look up an item in this hash table, the hash function is applied again; but if we've had collisions and some items are not at their calculated positions, then finding an item will also involve linear probing – that is, a linear search.
The more items there are in a hash table, the more likely you are to get collisions when you insert even more data. One way to deal with this is to make the hash table bigger than needed for the total amount of data – big enough that only about 70% of the hash table is ever occupied.
The ratio of the number of stored items to the size of the array is known as the load factor. If the hash table is implemented as a resizable dynamic data structure, it can be made to increase in size automatically when the load factor reaches a certain threshold. In an ideal world, every item would be stored in the hash table at its calculated index; in this best-case scenario, the time taken to find any particular item is always the same. In the worst case, depending on the nature of the data used to calculate the index values and on the appropriateness of the hash algorithm, some items may require a linear search of the whole table in order to be found.
Another way to deal with collisions is known as "separate chaining": each position in the array holds a pointer to the first node of a linked list of all the items that hashed to that index. With open addressing, by contrast, if the calculated address is occupied then linear probing tries the next slot along, and if necessary the next, and so on until an empty slot is eventually found. However, this can result in what is known as primary clustering: keys bunch together inside the array while large proportions of it remain unoccupied. One way to reduce clustering is to look at, say, every third slot along until a free space is found (a "plus three" rehash).
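A bare-bones sketch of separate chaining: M short linked lists, one per array slot. The table size M, the Object-typed keys and values, and the names are placeholder choices of mine, not a tuned implementation:

public class SeparateChainingSketch {
    private static final int M = 97;     // number of chains (placeholder size)
    private Node[] st = new Node[M];     // st[i] = first node of the i-th chain

    private static class Node {
        Object key, val;
        Node next;
        Node(Object key, Object val, Node next) { this.key = key; this.val = val; this.next = next; }
    }

    private int hash(Object key) {       // map a key to a chain index in 0..M-1
        return (key.hashCode() & 0x7fffffff) % M;
    }

    public Object get(Object key) {
        for (Node x = st[hash(key)]; x != null; x = x.next)
            if (key.equals(x.key)) return x.val;  // search the chain for the key
        return null;                              // search miss
    }

    public void put(Object key, Object val) {
        int i = hash(key);
        for (Node x = st[i]; x != null; x = x.next)
            if (key.equals(x.key)) { x.val = val; return; } // key already present: update value
        st[i] = new Node(key, val, st[i]);                  // otherwise insert at the front of the chain
    }
}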
When choosing or writing a hash function, there are some objectives to bear in mind: it should be cheap to compute, it should distribute the keys as uniformly as possible across the table, and it should use all of the available data in the key.
- Left-leaning red-black BST: You write code for an auto-pilot. It is very important that all symbol table look-ups execute in the specified time.
- Hash Table: You write code for a game that uses a lot of animation. Speed is your first concern when selecting a symbol table implementation.
// Assume a new item with a new key is added to
// the hash table, that uses separate chaining.
// What is the description of the process?
--> map key to integer i
--> insert new item at the front of i-th chain
// -------------------------------------------------------------------
// Assume you have 1000 items that need to be stored in a hash
// table and you use separate chaining. Which array size should
// you choose for your hash table if you want constant time searches,
// and you don't want to allocate too much extra memory?
--> 200
// For complicated keys it can be expensive to calculate a good hash code.
// There are different ways to respond to such a situation. What options should you consider?
- Consider using left-leaning red-black BSTs for your symbol table implementation.
- Store the hash code so it has to be calculated only once.
// -------------------------------------------------------------------
// The performance of hash tables is based on the uniform hashing assumption.
// This can cause issues in a number of situations.
// Below you find 2 examples where the hash function can cause a problem:
- If the hash code function is known and exploited to create a denial-of-service attack.
- If a guaranteed performance is required (e.g. hard real time systems).
The birthday problem says that, when tossing balls uniformly at random into M bins, we can expect two balls to land in the same bin after about sqrt(pi * M / 2) tosses. For birthdays, M = 365 and sqrt(pi * 365 / 2) ≈ 24, so once we have roughly two dozen people in a room we can expect two of them to celebrate their birthday on the same day.
PROBLEM #1
// There are a number of issues that come with hashing.
// Below you find four options.
// Only three of them describe issues related to hashing.
– compute the hash function
– override compareTo
– test for equality
– collision resolution
// override compareTo, does NOT belong here.
PROBLEM #2
// Here are 3 suggestions to create a simple hash function for employees.
// Which one should you choose?
– Use the zip code as a hash function for employees
– Use the last 4 digits of the social security number as a hash function for employees
– Use the area code as a hash function for employees
// Use the last 4 digits of the social security number as a hash function for employees.
// All classes in Java inherit the method hashCode(), which returns an integer.
// The Java language has built-in rules for the implementation of hashCode.
// Below you find some statements regarding the method hashCode.
// One of them is false. Which of the following statements is NOT true?
if ( x.equals(y))
then we know for sure that x.hashCode() == y.hashCode()
if (x.hashCode() != y.hashCode())
then we know for sure that (! x.equals(y))
if (! x.equals(y))
then we know for sure that x.hashCode() != y.hashCode()
// Option 3 is false.
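As a small illustration of the first two rules, here is a hedged sketch of a user-defined class whose hashCode() is consistent with equals(), so equal objects always produce equal hash codes; the class and its fields are made up for the example:
import java.util.Objects;
public final class Point {
    private final int x, y;
    public Point(int x, int y) { this.x = x; this.y = y; }
    @Override public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;                // equality depends only on x and y
    }
    @Override public int hashCode() {
        return Objects.hash(x, y);                  // combines exactly the fields used by equals
    }
}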
Assume you have a linear probing hash table with uniform hashing.
The hash table has size M and the number of keys in the table is N (0 < N < M)
--> The average number of misses before a key can be inserted depends on N / M.
// -------------------------------------------------------------------
If the array is too large compared to the
number of keys, we use up too much memory.
If the number of keys is too large compared to
the size of the array, the search is too slow.
Let's assume that the size of the array is M and the number of keys is N.
Therefore,
if N / M ~ 0.5, the number of probes for a search hit is about 3/2, and for a search miss (or an insert) it is about 5/2.
The best description of searching in a linear probing hash table is: start at the hashed index and skip over the keys stored there until you either find the search key (search hit) or reach an empty slot (search miss).
// CLASS EXERCISE
// Insert the keys E A S Y Q U E S T I O N into an initially empty hash table
// with M = 6, using hash(k) = (13 * k) % 6, where k is the key's position in the alphabet.
Hash values:
E (5 * 13) % 6 = 5
A (1 * 13) % 6 = 1
S (19 * 13) % 6 = 1
Y (25 * 13) % 6 = 1
Q (17 * 13) % 6 = 5
U (21 * 13) % 6 = 3
E (5 * 13) % 6 = 5
S (19 * 13) % 6 = 1
T (20 * 13) % 6 = 2
I (9 * 13) % 6 = 3
O (15 * 13) % 6 = 3
N (14 * 13) % 6 = 2
Trace, showing for each index 0-5 the most recently inserted key at that index after every insertion
(consistent with separate chaining, where new keys go to the front of the i-th chain):
            0   1   2   3   4   5
insert E    -   -   -   -   -   E
insert A    -   A   -   -   -   E
insert S    -   S   -   -   -   E
insert Y    -   Y   -   -   -   E
insert Q    -   Y   -   -   -   Q
insert U    -   Y   -   U   -   Q
insert E    -   Y   -   U   -   E
insert S    -   S   -   U   -   E
insert T    -   S   T   U   -   E
insert I    -   S   T   I   -   E
insert O    -   S   T   O   -   E
insert N    -   S   N   O   -   E
Examples of exception filters: a spell checker that flags words not found in a dictionary, or a spam filter that blocks messages whose sender appears on a blacklist.
Description of the dictionary application: build a symbol table from a file of key-value pairs, then look up keys and print the associated values (for example, looking up a DNS name to find the corresponding IP address).
About csv files: a csv (comma-separated values) file stores one record per line with its fields separated by commas, which makes it a convenient input format for building such a dictionary.
An undirected graph is a finite set of vertices together with a finite set of edges (links) between them. Each node is called a vertex, each link is called an edge, and each edge connects two vertices; the order of the two connected vertices is unimportant.
The Graph API provides an overloaded constructor that reads in a graph from an input stream.
What's the best way to describe the expected file format?
v1, .. vn are integers representing vertices that are connected by edges.
Answer:
i) number of vertices
ii) number of edges
iii) E lines, one per edge, each listing a pair of connected vertices (e.g. v1 v2)
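A hedged sketch of how such a constructor could read that format, using the booksite's In class (readInt() returns the next integer from the stream); treat this as an outline, not the exact library source:
public Graph(In in) {
    this(in.readInt());                 // i) number of vertices V
    int E = in.readInt();               // ii) number of edges E
    for (int i = 0; i < E; i++) {       // iii) one pair of vertices per edge
        int v = in.readInt();
        int w = in.readInt();
        addEdge(v, w);
    }
}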
// ------------------------------------------------------------------------------------------
Prof. Sedgewick's graph API represents vertices with integers between 0 and V - 1.
However, sometimes graphs use other types like letters or
Strings to represent vertices. How can we use Prof. Sedgewick's
graph classes when the client provides the vertices as Strings?
e.g.:
graph data where the vertices are cities and the connections indicate
whether there exists a train connection between the cities.
Answer:
// You could use a symbol table that matches the
// strings to integers and then use his graph classes
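For example, a hedged sketch of that idea inside a client method, using java.util.HashMap as the symbol table (the city names are made up, and the book's Graph class is assumed to be on the classpath):
import java.util.HashMap;
HashMap<String, Integer> index = new HashMap<>();
String[] cities = { "Berlin", "Paris", "Madrid" };
for (int i = 0; i < cities.length; i++)
    index.put(cities[i], i);                             // "Berlin" -> 0, "Paris" -> 1, ...
Graph g = new Graph(cities.length);                      // vertices are 0 .. V-1 as usual
g.addEdge(index.get("Berlin"), index.get("Paris"));      // train connection Berlin-Paris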
There are three common ways to represent a graph:
Representation      Memory    Add edge    Edge between v and w    Iterate over vertices adjacent to v
List of edges       E         1           E                       E
Adjacency matrix    V^2       1 *         1                       V
Adjacency lists     E + V     1           degree(v)               degree(v)
(* assuming parallel edges are disallowed)
Iterating over all the vertices that are adjacent to vertex v:
- List of edges: you have to go through all the edges.
- Adjacency matrix: you have to go through all the vertices.
- Adjacency lists: you only have to go through the adjacent vertices.
An adjacency list represents a graph as an array of linked lists. The index of the array represents a vertex, and each element in its linked list represents the other vertices that form an edge with that vertex. The simplest adjacency list needs a node data structure to store a vertex and a graph data structure to organize the nodes. We stay close to the basic definition of a graph: a collection of vertices and edges {V, E}. For simplicity, we use an unlabeled graph as opposed to a labeled one, i.e. the vertices are identified by their indices 0, 1, 2, 3. We use Java Collections to store the array of linked lists.
Advantages: memory proportional to E + V (good for sparse graphs), adding an edge takes constant time, and iterating over the vertices adjacent to a given vertex takes time proportional to its degree.
Disadvantages: checking whether an edge exists between two given vertices requires scanning one vertex's list, which takes time proportional to degree(v); for dense graphs an adjacency matrix can be preferable.
// A graph with vertices 0, 1, 2, 3 and edges 0-1, 0-2, 0-3, 1-2 can be
// represented in the form of linked lists on a computer as shown below:
0 -> 1 -> 2 -> 3
1 -> 0 -> 2
2 -> 0 -> 1
3 -> 0
0, 1, 2, 3 are the vertices and each of them forms
a linked list with all of its adjacent vertices.
For instance:
vertex 1 has two adjacent vertices 0 and 2.
Therefore:
1 is linked with 0 and 2 in the figure above.
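A minimal sketch of that adjacency-list representation built with Java Collections; the class name is made up and the edges are those of the example above:
import java.util.ArrayList;
import java.util.List;
public class AdjListSketch {
    // undirected edge: add each endpoint to the other's list
    static void addEdge(List<List<Integer>> adj, int v, int w) {
        adj.get(v).add(w);
        adj.get(w).add(v);
    }
    public static void main(String[] args) {
        int V = 4;
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        addEdge(adj, 0, 1);
        addEdge(adj, 0, 2);
        addEdge(adj, 0, 3);
        addEdge(adj, 1, 2);
        for (int v = 0; v < V; v++)
            System.out.println(v + " -> " + adj.get(v));   // e.g. 0 -> [1, 2, 3]
    }
}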
Depth First Search: puts unvisited vertices on a stack (the recursive version uses the call stack implicitly).
Breadth First Search: puts unvisited vertices on a queue.
Class Path has two arrays (fields) that keep track of vertex
related information while executing the depth-first search.
- an array that stores for each vertex whether the vertex has already been visited
- an array that stores the prior vertex (where we came from)
Class Path returns the path as an Iterable<Integer>.
Iterable is an interface.
What is actually returned is an instance of a class
that implements the interface Iterable<E>.
// What is the type of the variable
// that stores and returns the path?
// Stack<Integer>
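A hedged sketch of that class, following the lecture's idea; the class name is illustrative and the book's Graph class is assumed. The lecture returns an algs4 Stack<Integer>, while this sketch uses java.util.ArrayDeque, which iterates in the same order after push():
import java.util.ArrayDeque;
import java.util.Deque;
public class DepthFirstPathsSketch {
    private final boolean[] marked;     // marked[v] = has vertex v already been visited?
    private final int[] edgeTo;         // edgeTo[v] = the prior vertex on the path to v
    private final int s;                // the source vertex
    public DepthFirstPathsSketch(Graph G, int s) {
        this.s = s;
        marked = new boolean[G.V()];
        edgeTo = new int[G.V()];
        dfs(G, s);
    }
    private void dfs(Graph G, int v) {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w]) { edgeTo[w] = v; dfs(G, w); }
    }
    public boolean hasPathTo(int v) { return marked[v]; }
    public Iterable<Integer> pathTo(int v) {
        if (!hasPathTo(v)) return null;
        Deque<Integer> path = new ArrayDeque<>();
        for (int x = v; x != s; x = edgeTo[x]) path.push(x);   // walk edgeTo[] back to the source
        path.push(s);
        return path;                                           // iterates from s to v
    }
}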
Digraph Examples:
Digraph: a set of vertices connected pairwise by directed edges.
Indegree: the number of directed edges (arrows) coming into a vertex.
Outdegree: the number of directed edges (arrows) leaving a vertex.
Both the Graph API and the Digraph API include the following method:
public Iterable<Integer> adj (int v)
Many real-world graphs are large and sparse; because of that, we use adjacency lists to implement the Digraph. The Digraph API also has a method called reverse(), which returns a copy of the digraph with the direction of every edge reversed.
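A minimal sketch of what such a reverse() method does, assuming the API's Digraph(int V), V(), adj(int v), and addEdge(int v, int w) methods:
public Digraph reverse() {
    Digraph R = new Digraph(V());
    for (int v = 0; v < V(); v++)
        for (int w : adj(v))
            R.addEdge(w, v);            // every edge v -> w becomes w -> v
    return R;
}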
Breadth first search (BFS) computes the shortest paths (fewest edges) from s to all other vertices in linear time, i.e. time proportional to E + V.
Depth First Search allows us to write simple solutions to a number of digraph problems.
Topological order: an ordering of the vertices of a digraph such that every directed edge points from a vertex earlier in the order to a vertex later in the order.
About DAGs: a DAG (directed acyclic graph) is a digraph with no directed cycles; a digraph has a topological order if and only if it is a DAG.
In order to implement the topological sort we do the following: run depth-first search from every unvisited vertex and return the vertices in reverse DFS postorder (see the sketch after the next paragraph).
Assume you use a digraph to schedule a set of tasks that have precedence constraints; in such a graph, the tasks should be the vertices and the constraints should be the edges.
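A hedged sketch of topological sort by DFS reverse postorder; the class name is made up, the input is assumed to be a DAG, and the book's Digraph class is assumed to be available:
import java.util.ArrayDeque;
import java.util.Deque;
public class TopologicalSketch {
    private final boolean[] marked;
    private final Deque<Integer> reversePost = new ArrayDeque<>();
    public TopologicalSketch(Digraph G) {             // G is assumed to be a DAG
        marked = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++)
            if (!marked[v]) dfs(G, v);
    }
    private void dfs(Digraph G, int v) {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w]) dfs(G, w);
        reversePost.push(v);                          // push v only after all its descendants are done
    }
    // iterating from the head of the deque yields the reverse postorder, i.e. a topological order
    public Iterable<Integer> order() { return reversePost; }
}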
An edge-weighted graph is a graph where we associate weights or costs with each edge. A spanning tree of a connected undirected graph is a subgraph that contains all the vertices and is a single tree (connected and acyclic). A minimum spanning tree (MST) of an edge-weighted graph is a spanning tree whose weight (the sum of the weights of its edges) is no larger than the weight of any other spanning tree.
Adding an edge that connects two vertices in a tree creates a unique cycle. Removing an edge from a tree breaks it into two separate subtrees.
Visit the documentation to learn more.
An edge-weighted graph is represented as a vertex-indexed array of bags of edges; each edge is stored exactly twice, once in the bag for each of its endpoints.
// How to print all the edges of a minimum spanning tree?
In in = new In(args[0]);
EdgeWeightedGraph G = new EdgeWeightedGraph(in);
MST mst = new MST(G);
// Use the iterator mst.edges() to print all edges:
for (Edge e : mst.edges())
    StdOut.println(e);
In the context of minimum spanning trees, a cut in a graph is a partition of its vertices into two nonempty sets, and a crossing edge connects a vertex in one set with a vertex in the other. The cut property states that, given any cut, the crossing edge of minimum weight is in the MST.
The following method colors black all edges in the MST of any connected edge-weighted graph with V vertices: starting with all edges colored gray, find a cut with no black crossing edges, color its minimum-weight edge black, and continue until V - 1 edges have been colored black.
2 simplifying assumptions for minimum spanning trees (MST):
// a) edge weights are distinct
// b) graph is connected
One of the statements below is NOT true.
Which one is the incorrect statement?
If b) is not true a minimum spanning forest is computed
If a) is not true there can be multiple MSTs
If b) is not true a MST does not exist
If a) and b) are true a MST exists and it is unique
// The incorrect answer is:
// If b) is not true a MST does not exist.
The greedy algorithm does compute the MST.
However, choosing the cut and finding the min-weight edge can be
difficult and expensive, especially in very large graphs.
The lecture lists 3 efficient implementations that differ in
the way they choose the cut and find the min-weight edge.
Two of them are:
// Kruskal's algorithm & Prim's algorithm
In the greedy MST algorithm we start with all edges colored gray.
Then we do the following:
find a cut with no black crossing edges;
color its min-weight edge black.
The time to compute a minimum spanning tree using Kruskal's algorithm is proportional to E log E. Such an algorithm can be used to group V objects into k clusters. It is also true that a randomized linear-time minimum spanning tree algorithm exists.
Prim's algorithm always adds to the tree the edge that is closest to the tree, i.e. the minimum-weight edge with exactly one endpoint in the tree.
The lazy implementation of Prim's algorithm computes the minimum spanning tree in time proportional to E log E, but it requires extra space proportional to E. It is called a lazy implementation because it leaves edges on the priority queue that may have become obsolete (both endpoints already in the tree).
In order to be able to efficiently update elements on the priority queue, we use an indexed priority queue called IndexMinPQ.
The eager version of Prim's algorithm requires V insert, V delete-min and E decrease-key operations.
Dijkstra's algorithm computes a shortest paths tree (SPT) in any edge-weighted digraph that has no negative weights. Dijkstra's algorithm and Prim's algorithm are essentially the same. The main difference is the rule that is used to select the next vertex for the tree.
Dijkstra's algorithm chooses the closest vertex to the source via a directed path. Prim's algorithm chooses the closest vertex to the tree via an undirected edge. Dijkstra's algorithm, Prim's algorithm, Kruskal's algorithm, DFS, and BFS all compute a graph's spanning tree.
To implement Dijkstra's algorithm, we use an IndexMinPQ
To relax an edge v -> w means: to test whether the best known path from s to w can be improved by going from s to v and then taking the edge v -> w; if so, update distTo[w] and edgeTo[w].
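A sketch of that relax step as it might appear inside the SP class used below, in the spirit of the lecture's eager Dijkstra; distTo[], edgeTo[] and the IndexMinPQ pq are assumed to be fields of that class:
private void relax(DirectedEdge e) {
    int v = e.from(), w = e.to();
    if (distTo[w] > distTo[v] + e.weight()) {               // does going through v improve the path to w?
        distTo[w] = distTo[v] + e.weight();
        edgeTo[w] = e;
        if (pq.contains(w)) pq.decreaseKey(w, distTo[w]);   // w already on the PQ: lower its key
        else                pq.insert(w, distTo[w]);        // w not yet on the PQ: add it
    }
}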
// Assume you have an object of type SP. It has been instantiated like this:
SP sp = new SP (G, s);
// Assume that v is a vertex of graph G.
// What does the following method call return?
sp.pathTo(v)
// All edges along the shortest path from s to v.
// Assume G is an edge-weighted digraph.
// The array distTo[] includes the shortest path distances from s if and only if:
distTo[s] = 0
// for each vertex v, distTo[v] is the length of some path from s to v.
// Which of the following options below is the missing condition of the proposition above?
for each vertex w, distTo[w] <= distTo[v] + e.weight()
for each edge e = v -> w, distTo[w] <= distTo[v] + e.weight()
for each vertex v, distTo[w] <= distTo[v] + e.weight()
for each edge e = v -> w, distTo[v] <= distTo[w] + e.weight()
// Answer:
for each edge e = v -> w, distTo[w] <= distTo[v] + e.weight()
Let G be an edge-weighted digraph and s a designated vertex of G. A shortest-path tree (SPT) is a subgraph of G that contains all the vertices reachable from s and forms a directed tree rooted at s, such that every tree path is a shortest path in the digraph. For each such G and s there exists a corresponding shortest-path tree.
Sources: lectures based on the book "Algorithms, 4th Edition" and the API documentation.