https://www.topcoder.com/community/data-science/data-science-tutorials/how-to-find-a-solution/
The Canvas contains 5 major areas: Constraints, Ideas, Complexities, Code, and Tests. Taken together, they capture everything you need to worry about when it comes to algorithm design problems.
Area #1: Constraints
Straight-forward problems that don’t require any special technique (e.g. simulation, searching, sorting etc.)
In most cases, these problems will ask you to perform some step by step, straight-forward tasks. Their constraints are not high, and not too low. In most cases the first problems (the easiest ones) in topcoder Single Rounds Matches are of this kind. They test mostly how fast and properly you code, and not necessarily your algorithmic skills.
In most cases, these problems will ask you to perform some step by step, straight-forward tasks. Their constraints are not high, and not too low. In most cases the first problems (the easiest ones) in topcoder Single Rounds Matches are of this kind. They test mostly how fast and properly you code, and not necessarily your algorithmic skills.
Most simple problems of this type are those that ask you just to execute all steps described in the statement.
Breadth First Search (BFS)
Problems that use BFS usually ask to find the fewest number of steps (or the shortest path) needed to reach a certain end point (state) from the starting one. Besides this, certain ways of passing from one point to another are offered, all of them having the same cost of 1 (sometimes it may be equal to another number). Often there is given a N x M table (formed of N lines and M columns) where certain cells are passable and others are impassable, and the target of the problem is to find the shortest time/path needed to reach the end point from the start one. Such tables may represent mazes, maps, cities, and other similar things. These may be considered as classical BFS problems. Because BFS complexity is in most cases linear (sometimes quadratic, or N logN), constraints of N (or M) could be high – even up to 1 million.
Problems that use BFS usually ask to find the fewest number of steps (or the shortest path) needed to reach a certain end point (state) from the starting one. Besides this, certain ways of passing from one point to another are offered, all of them having the same cost of 1 (sometimes it may be equal to another number). Often there is given a N x M table (formed of N lines and M columns) where certain cells are passable and others are impassable, and the target of the problem is to find the shortest time/path needed to reach the end point from the start one. Such tables may represent mazes, maps, cities, and other similar things. These may be considered as classical BFS problems. Because BFS complexity is in most cases linear (sometimes quadratic, or N logN), constraints of N (or M) could be high – even up to 1 million.
Problem hints:
- Words can be considered as states. There are at most 26^4 different words composed of 4 letters (thus a linear complexity will run in allowed time).
- There are some ways to pass from one state to another.
- The cost of passing from a state to another is always 1 (i.e. a single button click).
- You need to find the minimum number of steps required to reach the end state from start state.
Everything indicates us that it’s a problem solved by the help of a BFS. Similar things can be found in any other BFS problems. Now let’s see an interesting case of BFS problems.
Problem hints: At first sight this may seem like dynamic programming or backtracking. But as always, take a look into the text of the statement. After a while you should observe the following things:
- A table is given.
- The knight can jump from one cell to some of its neighbors.
- The cost of passing from a cell to another is always 1 (just one jump).
- You need to find the minimum number of steps (jumps).
Given this information we can see that the problem can be easily solved by the help of BFS. Don’t get confused by the fact that connected points are represented by unconnected cells. Think of cells as points in a graph, or states (whatever you want) – and in order to pass from one point to another, the knight should be able to jump from the first to the second point.
Notice again that the most revealing hint about the BFS solution is the cost of 1 for any jump.
Train yourself in finding the hints of a BFS problem in following examples:
Other example(s):
RevolvingDoors – SRM 223 Div 1
WalkingHome – SRM 222 Div 1
TurntableService – SRM 219 Div 1
RevolvingDoors – SRM 223 Div 1
WalkingHome – SRM 222 Div 1
TurntableService – SRM 219 Div 1
Flood Fill
Sometimes you may encounter problems that are solved by the help of Flood Fill, a technique that uses BFS to find all reachable points. The thing that makes them different from BFS problems described above is that a minimum path/cost is not needed.
Sometimes you may encounter problems that are solved by the help of Flood Fill, a technique that uses BFS to find all reachable points. The thing that makes them different from BFS problems described above is that a minimum path/cost is not needed.
For example, imagine a maze where 1 represents impassable cells and 0 passable cells. You need to find all cells that are reachable from the upper-left corner. The solution is very simple – take one-by-one a visited vertex, add its unvisited neighbors to the queue of visited vertices and proceed with the next one while the queue is still populated. Note that in most cases a DFS (Depth First Search) will not work for such problems due to stack overflows. Better use a BFS. For inexperienced users it may seem harder to implement, but after a little training it becomes a "piece of cake". A good example of such problem would be:
Problem hints: What do we have here?
- A map (table)
- Certain points are impassable (those covered by given rectangles)
- Contiguous areas need to be found
It is easy to understand that a problem with such a statement needs a Flood Fill. Usually problems using it are very easy to detect.
Brute Force and Backtracking
I have placed these 2 techniques in the same category because they are very similar. Both do the same thing – try all possible cases (situations) and choose the best one, or count only those that are needed (depending on the problem). Practically, Backtracking is just more advanced and optimized than Brute Force. It usually uses recursion and is applied to problems having low constraints (for example N<=20).
I have placed these 2 techniques in the same category because they are very similar. Both do the same thing – try all possible cases (situations) and choose the best one, or count only those that are needed (depending on the problem). Practically, Backtracking is just more advanced and optimized than Brute Force. It usually uses recursion and is applied to problems having low constraints (for example N<=20).
Brute Force
There are many problems that can be solved by the help of a simple brute force. Note that the limits must not be high. How does a brute force algorithm work? Actually, it tries all possible situations and selects the best one. It’s simple to construct and usually simple to implement. If there is a problem that asks to enumerate or find all possible ways (situations) of doing a certain thing, and that doesn’t have high limits – then it’s most probably a brute force problem.
There are many problems that can be solved by the help of a simple brute force. Note that the limits must not be high. How does a brute force algorithm work? Actually, it tries all possible situations and selects the best one. It’s simple to construct and usually simple to implement. If there is a problem that asks to enumerate or find all possible ways (situations) of doing a certain thing, and that doesn’t have high limits – then it’s most probably a brute force problem.
Problem hints: Well, this is one of the easiest examples. So which are the hints of this statement?
- You need to find all possible situations (positions) that satisfy a certain rule (threatens all given pieces).
- The limits are very low – only 8 knights are at most given.
It’s a common Brute Force problem’s statement. Note that x and y limits are not relevant, because you need only try all positions that threaten one of the knights. For each of these positions see if the knight placed at that position threatens all others too.
Another interesting problem would be:
Problem hints: And again one of the most important hints is the low limit of the size of the grid – only 50. This problem is possible to be solved with the help of the Brute Force because for each cell you can try to find the circle whose center is situated in that cell and that respects the rules. Among all of these circles found, select the one that has the greatest radius.
Complexity analysis: there are at most 50×50 cells, a circle’s radius is an integer and can be at most 25 units, and you need a linear time (depending on your implementation) for searching the cells situated on the border of the circle. Total complexity is low and thus you can apply a simple Brute Force here.
Backtracking
This technique may be used in many types of problems. Just take a look at the limits (N, M and other main parameters). They serve as the main hint of a backtrack problem. If these are very small and you haven’t found a solution that’s easier to implement – then just don’t waste your time on searching it and implement a straight-forward backtracking solution.
This technique may be used in many types of problems. Just take a look at the limits (N, M and other main parameters). They serve as the main hint of a backtrack problem. If these are very small and you haven’t found a solution that’s easier to implement – then just don’t waste your time on searching it and implement a straight-forward backtracking solution.
Usually problems of this kind ask you to find (similarly to Brute Force):
- Every possible configuration (subset) of items. These configurations should respect some given rules.
- The "best" configuration (subset) that respects some given rules.
Problem hints:
- First look at the constraints – there are at most ONLY 6 people! It’s enough for generating all possible permutations, sets etc.
- There are different possible ways to pass the people from one side to another and you need to find the best one.
This is of course a problem solved with a backtracking: at the beginning choose any 2 people to pass the bridge first, and after that at each step try to pass any of those that have been left on the start side. From all these passages select the one that needs the smallest amount of time. Note that among persons that have passed over the bridge, the one having the greatest speed should return (it’s better than returning one having a lower speed). This fact makes the code much easier to implement. After having realized these things – just code the solution. There may be others – but you will lose more time to find another than to code this one.
Problem hints: Only 9 numbers are given at most; and every distinct way (configuration) to arrange the numbers so that they form a magic number square should be found. These are the main properties of a Backtracking problem. If you have observed them – think about the code. You can generate all permutations of numbers and for each of them check if it forms a magic square. If so – add it to the answer. Note that it must be unique. A possible way to do that – is to have a list of earlier found configurations, thus for each new magic square check if it exists in that list and if it doesn’t – add it to the answer. There will not be many distinct magic squares, thus no additional problems will appear when applying this method.
Dynamic Programming
Quite a few problems are solved with the help of this technique. Knowing how to detect this type of problem can be very valuable. However in order to do so, one has to have some experience in dynamic programming. Usually a DP problem has some main integer variables (e.g. N) which are neither too small, nor too big – so that a usual DP complexity of N^2, N^3 etc. fits in time. Note that in the event that N is very small (for TC problems usually less than 30) – then it is likely the problem is not a DP one. Besides that there should exist states and one or more ways (rules) to reach one greater state from another lower one. In addition, greater states should depend only upon lower states. What is a so-called state? It’s just a certain configuration or situation. To better understand dynamic programming, you may want to read this article.
Problem hints:Quite a few problems are solved with the help of this technique. Knowing how to detect this type of problem can be very valuable. However in order to do so, one has to have some experience in dynamic programming. Usually a DP problem has some main integer variables (e.g. N) which are neither too small, nor too big – so that a usual DP complexity of N^2, N^3 etc. fits in time. Note that in the event that N is very small (for TC problems usually less than 30) – then it is likely the problem is not a DP one. Besides that there should exist states and one or more ways (rules) to reach one greater state from another lower one. In addition, greater states should depend only upon lower states. What is a so-called state? It’s just a certain configuration or situation. To better understand dynamic programming, you may want to read this article.
- Two main integer variables are given (N and S). These are neither too small, nor are they too big (i.e. a complexity of N*S fits in time).
- A state can be defined as the minimum number of coins needed to reach a certain sum.
- A sum (state) i depends only on lower sums (states) j (j<i).
- By adding a coin to a certain sum – another greater sum is reached. This is the way to pass from one state to another.
Thus all properties of a DP problem are uncovered in this statement. Let’s see another (slightly harder) DP problem
Problem hints:
- There are N numbers given (1<=N<=50), thus N isn’t too small, nor too big.
- A state (i,d) can be defined as the length of the longest zig-zag subsequence ending with the i-th number, for which the number before the last one is smaller than it for d=0, and bigger for d=1.
- A state i (i.e. a subsequence ending with the i-th number) depends only on lower states j (j<i).
- By adding a number to the end of a subsequence – another bigger (greater) subsequence is created. This is the way to pass from one state to another.
As you can see – this statement has almost the same traits (pattern) as in the previous problem. The most difficult part in identifying a DP problem statement is observing/seeing the states with the properties described above. Once you can do that, the next step is to construct the algorithm, which is out of the scope of this article.
Other example(s):
ChessMetric – 2003 TCCC Round 4
AvoidRoads – 2003 TCO Semifinals 4
FlowerGarden – 2004 TCCC Round 1
BadNeighbors – 2004 TCCC Round 4
ChessMetric – 2003 TCCC Round 4
AvoidRoads – 2003 TCO Semifinals 4
FlowerGarden – 2004 TCCC Round 1
BadNeighbors – 2004 TCCC Round 4
Hard Drills:
Maximum Flow
In many cases it’s hard to detect a problem whose solution uses maximum flow. Often you have to create/define graphs with capacities based on the problem statement.
Maximum Flow
In many cases it’s hard to detect a problem whose solution uses maximum flow. Often you have to create/define graphs with capacities based on the problem statement.
Here are some signs of a Maximum Flow problem:
- Take a look at the constraints, they have to be appropriate for a O(N^3) or O(N^4) solution, i.e. N shouldn’t be greater than 500 in extreme cases (usually it’s less than 100).
- There should be a graph with edges having capacities given, or you should be able to define/create it from data given in the statement.
- You should be finding a maximum value of something.
As you can see – it’s a straight-forward maximum flow problem: water pipes represent edges of the graph, their junctions – vertices; and you have to find the maximum value of amount of water that can flow from start to end vertex in a unit of time.
Optimal Pair Matching:
These problems usually have a list of items (from a set A) for which other items (from a set B) should be assigned under some rules, so that all (or a maximum possible number of) items from set A have to each be assigned to a certain item from set B.
These problems usually have a list of items (from a set A) for which other items (from a set B) should be assigned under some rules, so that all (or a maximum possible number of) items from set A have to each be assigned to a certain item from set B.
Mixed:
Some problems need other techniques in addition to network flows.
Some problems need other techniques in addition to network flows.
Problem hints: By reading this problem one can simply understand the main idea of the solution – it should be something similar to optimal pair matching, because each car (point from a set A) should be assigned to a parking lot (point from a set B) so that all are assigned and that there is at most one car assigned to a parking lot. Additionally, there can be cars that can’t reach certain parking lots, thus some pairs of points (one point from A and the other from B) are not connected. However a graph should be created for optimal pair matching. The way to make it is clear – an edge exists between a car and a parking lot if only there is a path between them, and its cost is equal to the shortest distance needed for the car to reach the parking lot. The next step of the solution is a binary search on the longest edge. Although it may be out of the scope of this article, I will provide a short explanation: At each step delete those edges of the initial graph that have costs greater than a certain value C (Note that you’ll have to save the initial graph’s state in order to repeat this step again for other C values). If it’s possible in this case to assign all the cars to parking lots – then take a smaller C, and repeat the same operation. If not – take a greater C. After a complete binary search, the smallest C for which a complete assignment is possible will be found. This will be the answer.
Linear Programming (Simplex)
Most of the common traits of problems solved with the help of the linear programming technique are:
Most of the common traits of problems solved with the help of the linear programming technique are:
- You are given collection of items having different costs/weights. There is a certain quantity of each item that must be achieved.
- A list of sets is given. These sets are composed of some of the available items, having certain quantities of each of them. Each set has a certain cost.
- The goal of the problem is to find an optimal combination (the cheapest one) of these sets so that the sum of quantities of each of the items they have is exactly the one needed to achieve.
At first it may seem confusing, but let’s see an example:
Problem hints:
- A collection of items (chemicals).
- A list of sets (available mixtures), each containing certain amounts of each of the items, and having a certain cost.
- You need to find the lowest price of the desired collection of items achieved by the combination of the available sets. More than that – you can take also non-integral amounts of mixtures.
These are exactly the traits described above.
https://www.hiredintech.com/classrooms/algorithm-design/lesson/31
The reasons are many: lack of theoretical knowledge, lack of practice, and an unsystematic approach to algorithmic problems. The Internet is full of “cookbook” recipes for how to solve each problem... As if reading through problems and memorizing solutions is going to do anyone any good!
by applying the right framework and by using sites like TopCoder for practice, algorithm design questions quickly become incredibly addictive.
The Algorithm Design Canvas captures our process for tackling algorithm design problems.
It's the most convenient way to represent algorithmic thinking. Every algorithmic problem, big or small, easy or hard, should eventually end up as a completed canvas.
The Canvas contains 5 major areas: Constraints, Ideas, Complexities, Code, and Tests. Taken together, they capture everything you need to worry about when it comes to algorithm design problems.
Area #1: Constraints
The Constraints area is where you fill in all constraints of the problem. This includes things like how large the input array can be, can the input string contain unicode characters, is the robot allowed to make diagonal moves in the maze, can the graph have negative edges, etc.
Your very first task when analyzing an algorithm design problem is to uncover all its constraints and write them down in this area.
Area #2: Ideas
After you've identified all constraints, you go into idea generation. Typically during an interview you discuss 1 to 3 ideas. Often times you start with one, explain it to the interviewer, and then move on to a better idea.
Your next goal is to fill in a succinct description of your idea: short and sweet so that any interviewer is able to understand it.
Area #3: Complexities
Each idea has two corresponding Complexity areas: Time and Memory. For every algorithm you describe, you will need to be able to estimate its time and memory complexity. Further in this course we will talk about the Time vs Memory trade-off and why it's key to any algorithm design problem.
Area #4: Code
After you've identified the problem's constraints, discussed a few ideas, analyzed their complexities, and found one that both you and your interviewer think is worth being implemented, you finally go into writing code.
Writing code at interviews is fundamentally different from writing code in your IDE
Area #5: Tests
Finally, you move on to writing test cases and testing your code. Many people completely ignore this step. This is not smart at all.
You never want to solve a problem that’s ill-defined. That’s why the very first thing you should turn your head to when it comes to algorithm design problems is the problem’s constraints.
The importance of constraints
Developing an algorithm that sorts 50 numbers is going to be fundamentally different from developing an algorithm that sorts 5 billion strings, each one 1 million characters long. So if I told you “design an algorithm that sorts an array”, you’d better not start coding a solution right away. Among other things, you need to understand things like what’s in the array, and how large the array can get.
You're probably aware that interview problems have a bunch of constraints, which tell you how efficient your solution should be. What is less obvious is that interviewers don't always give you all the information needed. They expect you to be asking clarifying questions. Don't be alarmed if not everything is clear in a problem statement, this is your chance to make it so.
How do you figure out the right questions?
Generally speaking, think about all the things that matter to your solution. For example, make sure you know the minimum and maximum possible values for any key value in the statement. These may affect the performance of your solution directly. If there are arrays of values you need to know the range of these values, too. They can be numbers, chars, geometric figures, etc. Some of these should come naturally to you once you hear the statement. If it says that there are N numbers and nothing else is mentioned about N, ask how big it can be. Others could come when you start designing your solution. If a given value is important for the complexity of your solution, don't hesitate to ask about it. Never assume things on your own.
The Algorithm Design Canvas helps you collect the needed information at an interview. It has a section where you need to fill in all the important constraints in a problem. If you use it while practicing you will get into the habit of making sure you know everything that you need to design an efficient solution.
Filling in the Constraints box is nothing more than asking the interviewer the right questions. Every problem’s different, but to get you started we’ve compiled a pretty extensive list of common things you should keep an eye on. All of them are included in the Common Constraints Handout.
Get the Common Constraints Handout (PDF)
The handout above will help you get a good idea of what is important in most interview problems. In addition to that, you will have to practice with real problems to gain experience. For this you will be able to use the problems we recommend further in the course.
One of the goals of this course is to teach you how to solve new problems. This will be the topic of the current section. We believe that it is much better to learn how to design solutions instead of trying to cover all interview questions that exist. There are a few reasons for that.
First of all, new interview questions get created all the time. It is virtually impossible to be well-prepared for all of them. Second, even if you know the solutions to many problems you need to be able to analyze these solutions, have a discussion about them and be able to tweak them in case the interviewer decides to modify the question in some way.
One final reason is that employers are looking for people who can think of solutions on the spot. This is why it is very useful to learn how to be such a person.
Remember that practicing solving problems is the only way to really get in good shape. These strategies are one tool that you can use but you will become good with it only if you devote enough time applying it. Actually, with time you will notice various patterns in the problems and it will become much easier for you to crack them.
The following strategies are popular among people participating in programming contests. Since the algorithmic interview questions are very similar to these, we believe that they are very relevant here, too.
Simplify the task
“A map of streets is given, which has the shape of a rectangular grid with N columns and M rows. At the intersections of these streets there are people. They all want to meet at one single intersection. The goal is to choose such an intersection, which minimizes the total walking distance of all people. Remember that they can only walk along the streets (the so called "Manhattan distance").”
So, how can we approach this problem?
Imagine that we only have one street and people are at various positions on that street. They will be looking for an optimal place to meet on this street. Can you solve this problem? It turns out to be much easier than the 2D version. You just have to find the median value of all people’s positions on the street and this is an optimal answer. We’ll leave it to you to prove it.
Now, if we go back to the original problem we can notice that finding the X and Y coordinates of the meeting point are two independent tasks. This is because of the way people move - Manhattan distance. This leads us to the final conclusion that we have to solve two 1D problems and this will give us the final answer. As we already know how to solve the 1D case, we are done.
This strategy allows you to start thinking about a simpler version of the problem and to draw some conclusions about how to solve the original problem.
Try a few examples
We’ve noticed that candidates rarely test with examples other than the one given by the interviewer. Sometimes, however, this helps a lot. You may start noticing patterns if you try to solve a few sample inputs that you create. It is ok to tell the interviewer that you would like to try writing down a few examples in order to try to find some pattern. Then do it quickly and see where it leads you.
Here is a sample problem:
“There are N+1 parking spots, numbered from 0 to N. There are N cars numbered from 1 to N parked in various parking spots with one left empty. Reorder the cars so that car #1 is in spot #1, car #2 is in spot #2 and so on. Spot #0 will remain empty. The only allowed operation is to take a car and move it to the free spot.”
This problem is not hard but at first seems intimidating. Write down 5 different examples and try to order the cars on a sheet of paper. See if you can notice a pattern in how it happens.
Think of suitable data structures
For some problems it is more or less apparent that some sort of data structure will do the job. If you start to get this feeling think about the data structures you know about and try to apply them and see if they fit. For example you can consider the data structures we cover in the algorithmic topics section.
Here is an example question:
“Design a data structure, which supports several operations: insert a number with O(logN), return the median element with O(1), delete the median element with O(logN), where N is the number of elements in the data structure.”
Here it is pretty obvious that a data structure is involved. What could we use to solve the problem? Which data structure has similar characteristics? Things like arrays, stack and vectors are far from these requirements. Perhaps some sort of a binary tree could do as they usually support logarithmic insert and remove operations.
Heaps do that but they either return the minimum or maximum element, not the median. Hm, but this is very close, if only we could get the median. What if we use two heaps, one stores the one half smallest numbers and the other is for the other half biggest numbers. Then some number in the middle is the median. As mentioned above one of the heaps could hold the maximum and the other the minimum element. This should be enough to return the median in constant time. We will leave the details to you to figure out.
In summary, if you get a feeling that a data structure could be the solution to a problem you are given, think a bit about the ones you know. You may have to combine two or more to get to the correct solution.
Think about related problems that you know.
This is a simple idea and could help if nothing else helps. Since you’ve been practicing hard for your interviews you surely came across many different problems. If you see a problem and cannot think of a solution, try to remember another problem, which looks like it. If there is such, think if its solution can somehow be adjusted to work for the problem at hand. This can sometimes mislead you but many problems are related, so it could also get you out of the situation.
Read carefully these strategies and try to apply them to the problems you face. In this course there are many recommended problems and you can do that with all of them. The Algorithm Design Canvas will help you write down the ideas that come to you.
TopCoder tutorials also provide a useful resource on this topic, which you may find worth reading: How to Find a Solution.
If you have other strategies for solving interview problems we will be more than happy to hear about them.
- It is important to learn to solve any problem instead of knowing all of them by heart.
- There is a set of well-known strategies for approaching interview problems. We look at them and use some of them for an example problem.
Why is complexity so important?
Solving problems is not just about finding a way to compute the correct answer. The solution should work quickly enough and with reasonable memory usage. Otherwise, even the smartest solution is not useful. Therefore, you have to show the interviewer two things:
- Your solution is good enough in terms of speed and memory.
- You are capable of evaluating the time and memory complexity of various algorithmic solutions.
Usually, if you fail to evaluate these parameters of your solutions this will reduce your chances of passing a technical interview. The interviewer needs to know that if you get hired, you will be able to choose wisely the solutions that you implement.
https://www.hiredintech.com/classrooms/algorithm-design/lesson/82
We see many people who jump straight into coding, ignoring all previous steps. This is bad in real life, and it’s bad when done at your interview. Never, ever jump straight into coding before having thought about and discussed constraints, ideas and complexities with your interviewer.
The one thing about coding (other than rushing into it too quickly) that many people struggle with is that coding in your IDE is not the same as coding on a whiteboard / a shared document / using some online system (in short: “outside your IDE”). As engineers, we’ve become so accustomed to relying on our IDEs (which is not bad), that when presented with a blank sheet of paper, we are lost. While this may have been OK at your day job, at an interview it simply doesn’t work.
This is why you need special preparation for “interview coding”. Say hello to the big juicy “Code” section of the Canvas, waiting to be filled in. At first it may be a bit tedious to write your code on paper, but very quickly you get used to it and you become more efficient. And the great news is that being able to code on a sheet of paper will not only boost your interviewing skills - it will also make you a better engineer overall.
- Think before you code. Especially if you’re coding on a sheet of paper (where there’s no “undo”), if you’re not careful everything could become very messy very quickly
- Coding outside of an IDE does not give you permission to stop abiding by good code style. Make sure you name your variables properly, indent nicely, write clean code, etc. etc.
- It’s even more important to decompose your code into small logical pieces and NOT copy-paste
- Read your code multiple times before you claim it’s ready. You don’t have the luxury to compile, see if it compiles, run, see if it runs, and then debug for 4 hours. Your code needs to work off the bat.
- When should you start coding at the interview?
- Coding for an interview is not like coding in your IDE. How can you prepare for that best?
Once your code is written, you don’t just lift your pen, say “I’m ready” and leave. A great next step is to verify it with several small test cases. And what would be even better is if you could tell your interviewer how you would extensively unit test your code beyond these sample tests.
Showing that you know and care about testing is a great advantage. It demonstrates that you fundamentally understand the important part testing plays in the software development process. It gives the interviewer confidence that once hired, you’re not going to start writing buggy code and ship it to production, and instead you’re going to write unit tests, review your code, work through example scenarios, etc.
Sample tests vs Extensive tests
Before we go into what makes a good test case, we're going to make one clarification around testing at interviews. There are two kinds of testing you may be asked to do (or decide to do) at an interview.
We call the first kind "sample" tests: these are small examples that you can feed to your code, and run through it line by line to make sure it works. You may want to do this once your code is written, or your interviewer may ask you to do it. The keyword here is "small": not "simple" or "trivial".
Alternatively, your interviewer may ask you to do more "extensive" testing, where they ask you to come up with some good test cases for your solution. Typically, you will not be asked to run them step-by-step. It is more of a mental test design exercise to demonstrate whether you're skilled at unit testing your code.
Extensive testing
Let's start with the bigger topic: what makes for a good test case. If you were to compile an extensive set of tests for your solution, what should these be? Here is a good list to get you started.
- Edge cases: Remember that "Constraints" section that we filled in? It's going to come in very handy. Design cases that make sure the code works when the min and/or max values of the constraints are hit. This includes negative numbers, empty arrays, empty strings, etc.
- Cases where there's no solution: To make sure the code does the right thing (hopefully you know what it is)
- Non-trivial functional tests: these depend very much on the problem. They would test the internal logic of the solution to make sure the algorithm works correctly.
- Randomized tests: this makes sure your code works well in the "average" case, as opposed to only working well on human-generated tests (where there's inherent bias).
- Load testing: Test your code with as much data as allowed by the constraints. These test your code against being very slow or taking up too much memory.
A good "test set" is a well-balanced combination of the above types. It will include tests that cover most edge cases, a few non-trivial functional tests, and then a series of random tests. These will make sure the code is solid and functionally correct. Finally, some load tests make sure the algorithm works well on the largest and most resource-demanding tests.
Sample tests
Sample tests are small tests that you run your code on at the interview to make sure it is solid. Now that we've covered the major kinds of tests you may design, which ones should you use as sample tests?
Typically, we stay away from randomized tests and load tests during interviews, for obvious reasons. Instead, we like choosing a small-scale version of a non-trivial functional test, to make sure the code does the right thing. Then, we look at how the code would react to several edge cases, and finally think about whether the code would work well if no solution can be found.
This combination (non-trivial functional + edge + no solution) tends to be the most effective. For the amount of time it takes to design the tests and to run them on your code on a sheet of paper, it gives you the highest certainty in your code.
How to prepare
There are three good practice activities that will significantly improve your testing skills.
- Number one is to make sure you understand the fundamental kinds of algorithm tests (mentioned above).
- The second step is to pay attention to the test cases for the TopCoder problems. Each problem has multiple "open" test cases (part of the problem statement). Pay attention to what cases they’re trying to cover. Additionally, when you run the System tests you're shown all test cases picked by the problem authors. Make sure you review some of those too.
- The third step is to always fill in the "Test cases" section of the Algorithm Design Canvas. See what scenarios you can come up with.
No matter how you prepare, the key message here is to not ignore testing your solution. Reading your code once it’s written and trying it out with a well-chosen test case is going to do wonders in finding everything from silly typos to actual bugs. Choose the test cases carefully, so that this does not turn into a huge time sink. You will learn to do that with some practice.
Always follow the Canvas
Many people feel tempted to skip certain areas of the Canvas during practice. Sometimes they do this because they think they're saving time, sometimes because they are overconfident and think the problem is too easy.
This is not a good idea. The Canvas is designed to build up all aspects of your interviewing chops, as you never know what your interviewer will want to focus on. By systematically thinking about all areas during practice, when the rubber hits the road you'll be a lot more confident and, ultimately, more successful.
Forget about your IDE
Read through your code
We all know people whose natural style is to write some code and then immediately try to run it and see if it works on some set of test cases. They spend about 20% writing the code, and then maybe 80% debugging it. While this may work well for them in real life, it simply does not work at interviews. You don't have the luxury of a debugger, nor the luxury of something automatically executing your code.
Instead, what you should do is to carefully think about the code you write, and then once you've written it, to spend some time reviewing it. If you're using TopCoder, make sure you are 100% confident that your code is going to compile AND run correctly on the tests before actually hitting "Compile" and "Run Tests". This practice of carefully inspecting your code is going to be priceless at interviews.
Imagine that you are given a task to generate all palindromes using a set of
N
latin letters. One brute force solution that comes to mind is to generate all permutations of the N
letters and to check for each one of them if it's a palindrome. Some people could say that the task is easy because this solution is easy to come up with. However, to generate all permutations of the N
letters you would need to generate N!
words. For even small values of N
this becomes a solution, which will not finish in a reasonable amount of time. Because of that we will need to design a better approach, which can compute the answer much faster.
At tech interviews you will have to compute and explain the time and memory complexity of your solutions quite often. This is something that will show your interviewer that you can evaluate if a solution is feasible given the input size. In real life this is also a very useful skill. Actually, in our opinion, it's a must-have skill for good software engineers.
First of all, time complexity will be measured in terms of the input size. As you saw in the example above,
N
was the number of latin letters to use for building palindromes. We said that all permutations of these N
letters are N!
. So to generate them we would perform a number of steps that is proportional to N!
. Of course, it is also important how many steps it takes to generate each separate permutation. It is possible to do that with a constant amount of steps. We will not go into details how this can be done but trust us.
This means that the number of steps to generate each next permutation does not depend on the size of the input. For each generated permutation we would need to check if it is a palindrome. One way to do it is to compare the first and last letters, then the second and the last but one and so on. This will require a number of steps that is proportional to the number of letters -
N
. So, for each permutation we will perform that many steps.
With all that in mind we will have to perform a number of steps that is proportional to
N * N!
in order to execute our brute force solution. This number will be multiplied by some constant but usually when this constant is not too high we don't take it into account. Now that we have quickly analysed the number of steps required, we can clearly see that the number of steps will grow very quickly with increasing values of N
. If your interviewer tells you that N
can be as high as 100, then there is no use in even considering such a solution. And being able to describe to the interviewer why such a solution is not feasible is also a useful skill.- TopCoder's 2-part tutorial on computational complexity is a good place to start, it also covers several notations. From the first part the whole article is useful. You can perhaps skip the section “Finally, formal definitions” if you wish. From the second part, everything up to the section “The substitution method” is worth covering for the purpose of interviews. The rest is more or less optional.
- This tutorial from Cprogramming.com describes the matter in an easy to understand way
- And this article uses a lot of examples to give you a better intuition
- From MIT we have a great lecture handout, which has some formal definitions but also has some good examples