Discussion about Interview Question 4

We used this as the first question in CS interviews at LMH in December 2018. The question below is from this post.

Question:

Consider the following ten-digit number:
\( 7~1~0~6~4~1~9~9~0~2\). You should remove three of the digits so as to get the largest possible seven digit number. As a first step, you may want to try to delete two digits from the following five digit number to get the largest possible three digit number:
\(7~1~9~3~4\).
Now consider the more general question when you are given an \(n\) digit number and asked to delete \(k < n\) out of the \(n\) digits, so as to leave the largest possible \(n - k\) digit number. Explain your general strategy. Evaluate the number of steps requried to perform your strategy and try to minimise this number as much as you can.

Solution:

The answers in the two specific instances are \( 9~3~4 \) by removing the \(7 \) from the first position and the \(1 \) from the second position, and \( 7~6~4~9~9~0~2 \), by removing the \(1\) in the second position, the \(0\) in the third position and the \(1\) from the sixth position.
We will assume that the digits are \(0 \) to \( 9 \), though the procedure remains the same even when considering numbers represented in an arbitrary base \( b \). Here is python code that implements a correct procedure.
```
				
def delete_digits(digits, k):
	n = len(digits)

	output = [0]*(n-k)

	i = 0
	deleted = 0
	max_val = max(digits) + 1  # larger than any digit
	last_output_digit = max_val
	while i < n:
		if digits[i] > last_output_digit and deleted < k:
			deleted += 1
			if deleted == i:
				last_output_digit = max_val
			else:
				last_output_digit = output[i - deleted - 1]
		else:
			output[i - deleted] = digits[i]
			last_output_digit = digits[i]
			i += 1
	return output
				
			
```
We'll discuss below how we arrived at this procedure as well as its correctness. For now we can observe that every time we enter the while loop, we increment exactly one of deleted and i by exactly \(1\). Since both are \(0\) at the start of the loop and we know that deleted can be at most \(k \) and i can be at most \(n \), the while loop runs for at most \( n + k \) iterations. Since the number of operations performed in one iteration of the while loop is at most a constant (in this case about \(10 \) comparison or assignment operations), the total running time cannot be more than linear in \(n + k \), or really just linear in \(n \) as \( k < n \).

Discussion: This question is relatively straightforward, but there are still a few potential pitfalls. And coming up with the algorithm above requires some thought process and insight.

First, there were a few candidates, though not many, who instinctively suggested deleting all the \(0\)'s first, then all the \(1\)'s and so on until we get to the correct number of digits. This is quite natural as a first thought, but obviously doesn't work as the location of the digits being deleted is as important as their value. For instance if we started with the number \(9890\) and had to remove on digit, then we are better of removing the \(8\) to get \(990\) than removing the \(0\) to get \(989\).

Respond to this